Function/Class Design Guidelines - oop

I asked a colleague yesterday whether, when a function has too many parameters, it would be better to create a class with properties instead. Are there any guidelines that I can follow?

I think that depends on the language you are using, the number of parameters in question, and whether some of them may be left out when calling the function.
VB has optional parameters, and C# 3+ allows properties to be set at instantiation (object initializers).
Will the new class have any use other than running that function, or will its state be relevant later in the code?

When the number of parameters exceeds 5 I usually start thinking about refactoring the method. There's no absolute number, but this is my general rule. It may make sense to group the data in a data class, or sometimes it means that I should move the method to be closer to the data.
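As a rough illustration of both ideas (grouping data in a data class, and moving a method closer to the data), here is a minimal Java sketch. All names in it (Dimensions, ShippingCalculator, quote) and the pricing rule are hypothetical and invented for the example.

```java
// Hypothetical data class grouping three related measurements.
final class Dimensions {
    final double lengthCm;
    final double widthCm;
    final double heightCm;

    Dimensions(double lengthCm, double widthCm, double heightCm) {
        this.lengthCm = lengthCm;
        this.widthCm = widthCm;
        this.heightCm = heightCm;
    }

    // Moved "closer to the data": only this class needs to know the formula.
    double volumeCm3() {
        return lengthCm * widthCm * heightCm;
    }
}

final class ShippingCalculator {
    private ShippingCalculator() {}

    // Before: quote(weightKg, lengthCm, widthCm, heightCm, express)
    // After: the three measurements travel together as one argument.
    // The pricing rule itself is invented for this example.
    static double quote(double weightKg, Dimensions size, boolean express) {
        double base = 2.0 * weightKg + 0.001 * size.volumeCm3();
        return express ? base * 1.5 : base;
    }
}
```

The signature drops from five parameters to three, and the volume calculation now has one obvious home.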

It all depends on the context. For example:
If it is not a database operation, follow the design of the system: break up the module and try to create submodules.
If it is a database system, I always prefer to write a separate bean class for the fields and a DAO class for the operations.

Related

Type pool or class of constants?

What is the difference between Type-pool and creating a class for constants?
What is better?
My question concerns a large group of constants that must be accessible to other groups.
Thank you
EDIT - Thank you for the answers and I will improve my question. I need something to store constants and I will use them on programs or other classes. Basically, I wanted to know if it is better to use a type-pool or a class with constants (only). I can have more than one class or type-pool.
The documentation mentions this:
Since it is possible to also define data types and constants in the public visibility section of global classes, type groups are obsolete and should no longer be created. Existing type groups can still be used.
A sensibly named interface with the constants you desire is the way to go. An additional benefit is that ABAP OO enforces some more rules.
I agree with @petul's answer, except for one detail: I'd recommend creating one enumeration-like class per logical group of constants, instead of collecting constants in interfaces.
Consider using the new enum language feature for specifying the constant values.
Interfaces can be accidentally "implemented", which doesn't make sense here. Classes can prevent this with final.
Making one class per logical group simplifies finding the constants with IDE features such as Ctrl+Shift+A search in the ABAP Development Tools. Constants that are randomly thrown together into interfaces are hard to find later on.
Classes allow adding enumeration-like helper methods like converters, existence checks, numbering all values.
Classes also allow adding unit tests, such as ensuring that the constant collection is still in sync with the fixed values of an underlying domain.

Downsides about using interface as parameter and return type in OOP

This is a question independent from languages.
Conceptually, it's good to code to interfaces (contracts) instead of specific implementations. I have no problem understanding the merits of the practice.
However, when I actually code this way, the users of my classes occasionally need to cast the interfaces to concrete types to reach functions that only specific implementing classes provide.
I understand that something must be wrong, either on my side or on the users' side, as the interface should expose all the methods and properties (in the case of C#) that could possibly be needed.
The code base is huge, and the users are clients.
It won't be particularly easy to make changes on either side.
That makes me wonder about the downsides of using interfaces as parameter and return types.
Can people please list demerits of the practice? And please, include any solution if you know how to work around it.
Thanks a lot for enlightening me.
EDIT:
To be a bit more specific:
Assume we have a class called DbInfoExtractor. It has a public method GetInfo, as follows:
public IInformation GetInfo(IInfoParam);
where IInformation is an interface implemented by specific classes like VideoInfo, AudioInfo, TextInfo, etc., and IInfoParam is an interface implemented by specific classes like VideoInfoParam, AudioInfoParam, TextInfoParam, etc.
Apparently, depending on the specific object passed into the method GetInfo, the DbInfoExtractor needs to take different actions, as it is reasonable to assume that for different types of information, the extractor considers different sets of aspects(e.g. {size, title, date} for video, {title, author} for text information, etc) as search keys and search for relevant information in different ways.
Here, I see two options to go on:
1. Use if ... else ... to decide which action to take depending on the type of the parameter that GetInfo receives. This is certainly bad, as avoiding this situation is one of the very reasons we use polymorphism.
2. Call IInfoParam.TakeAction(), where each specific implementation of IInfoParam has its own TakeAction() method that actually searches for and finds the corresponding information in the database.
This option seems better, but it is still quite bad: it shouldn't be the parameter that searches for and finds the information; that should be the responsibility of DbInfoExtractor.
So how can I delegate the TakeAction back to DbInfoExtractor? (I actually wrote some code to do this, but it's neither standard nor elegant. Basically I make parameter classes nested classes in DbInfoExtractor, so that they can call various versions of TakeAction of DbInfoExtractor.)
Please enlighten me!
Thanks.
Why not
public IVideoInformation GetVideoInformation(VideoQuery);
public IAudioInformation GetAudioInformation(AudioQuery);
// etc.
It doesn't look like there's a need for polymorphism here.
The query types are Query Objects, if you need those. They probably don't need to be interfaces; they know nothing about the database. A simple list of parameters (maybe just ID) might be sufficient.
The question is what does the client have, and what do they want? That's your interface.
Switch statements and casting are a smell, and typically mean that you've violated the Liskov substitution principle.
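To make the "one method per information type" suggestion concrete, here is a minimal Java sketch. It is only an illustration under assumptions: the class names follow the answer's naming loosely, the fields are guessed from the question's examples, and two in-memory maps stand in for the real database.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Plain query objects: they carry search keys and know nothing about storage.
final class VideoQuery {
    final String title;
    VideoQuery(String title) { this.title = title; }
}

final class AudioQuery {
    final String title;
    final String author;
    AudioQuery(String title, String author) { this.title = title; this.author = author; }
}

// Result types (fields are assumptions for the example).
final class VideoInformation {
    final String title;
    final long sizeBytes;
    VideoInformation(String title, long sizeBytes) { this.title = title; this.sizeBytes = sizeBytes; }
}

final class AudioInformation {
    final String title;
    final String author;
    AudioInformation(String title, String author) { this.title = title; this.author = author; }
}

final class InfoExtractor {
    // Stand-ins for database tables.
    private final Map<String, VideoInformation> videosByTitle = new HashMap<>();
    private final Map<String, AudioInformation> audioByKey = new HashMap<>();

    void add(VideoInformation v) { videosByTitle.put(v.title, v); }
    void add(AudioInformation a) { audioByKey.put(a.title + "|" + a.author, a); }

    // Each method has its own search logic and a precise return type,
    // so callers never need a cast or a type switch.
    Optional<VideoInformation> getVideoInformation(VideoQuery q) {
        return Optional.ofNullable(videosByTitle.get(q.title));
    }

    Optional<AudioInformation> getAudioInformation(AudioQuery q) {
        return Optional.ofNullable(audioByKey.get(q.title + "|" + q.author));
    }
}
```

The search responsibility stays in the extractor, and the compiler picks the right code path from the static types.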

Naming convention and structure for utility classes and methods

Do you have any input on how to organize and name utility classes?
Whenever I run in to some code-duplication, could be just a couple of code lines, I move them to a utility class.
After a while, I tend to end up with a lot of small static classes, usually with only one method each, which I usually put in a utility namespace that becomes bloated with classes.
Examples:
ParseCommaSeparatedIntegersFromString( string )
CreateCommaSeparatedStringFromIntegers( int[] )
CleanHtmlTags( string )
GetListOfIdsFromCollectionOfX( CollectionX )
CompressByteData( byte[] )
Usually, naming conventions tell you to name your class as a noun. I often end up with a lot of classes like HtmlHelper and CompressHelper, but those names aren't very informative. I've also tried being really specific, like HtmlTagCleaner, which usually ends up with one class per utility method.
Have you any ideas on how to name and group these helper methods?
I believe there is a continuum of complexity, and therefore of corresponding organizations. Examples follow; choose depending on the complexity of your project and your utilities, and adapt to other constraints:
One class (called Helper), with a few methods
One package (called helper), with a few classes (called XXXHelper), each class with a few methods.
Alternatively, the classes may be split in several non-helper packages if they fit.
One project (called helper), with a few packages (called XXX), each package with ...
Alternatively, the packages can be split in several non-helper packages if they fit.
Several helper projects (split by tier, by library in use or otherwise)...
At each grouping level (package, class):
the common part of the meaning becomes the name of the grouping
the inner members don't need to repeat that meaning (so their names are shorter, more focused, and can use full words instead of abbreviations).
For projects, I usually repeat the common meaning in a superpackage name. Although this is not my preferred choice in theory, my IDE (Eclipse) does not show which project a class is imported from, so I need the information repeated. The project is actually only used:
as a shipping unit: some deliverables or products will include the jar, and those that don't need it won't,
to express dependencies: for example, a business project has no dependency on web tier helpers. Having expressed that in project dependencies, we reduce apparent complexity; and if we do find such a dependency, we know something is wrong and start to investigate. Reducing dependencies may also speed up compilation and building,
to categorize the code so it can be found faster: this matters only when the project is huge, and I'm talking about thousands of classes.
Please note that all the above applies to dynamic methods as well, not only static ones.
These are actually our good practices for all our code.
Now that I tried to answer your question (although in a broad way), let me add another thought
(I know you didn't ask for that).
Static methods (except those using static class members) work without context: all data has to be passed as parameters. We all know that, in OO code, this is not the preferred way. In theory, we should look for the object most relevant to the method, and move the method onto that object. Remember that code sharing doesn't have to be static; it only has to be public (or otherwise visible).
Examples of where to move a static method :
If there is only one parameter, to that parameter.
If there are several parameters, choose between moving the method to:
the parameter that is used most: the one with several fields or methods used, or used by conditionals (ideally, some conditionals would be removed by subclasses overriding the method),
an existing object that already has good access to several of the parameters,
a new class built for that need.
Although this method moving may seem to be for OO purists only, we find it actually helps us in the long run (and it proves invaluable when we want to subclass, to alter an algorithm). Eclipse moves a method in less than a minute (with all verifications), and we gain much more than a minute when we look for some code, or when we avoid re-coding a method that was already written.
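A small Java sketch of the single-parameter case described above, with hypothetical names (OrderUtil, Order, DiscountedOrder) made up for the illustration: the static utility is shown next to the instance method it could be moved to, and a subclass then alters the algorithm by overriding.

```java
import java.util.ArrayList;
import java.util.List;

// Before: a context-free static utility; every input must be passed in.
final class OrderUtil {
    private OrderUtil() {}

    static double total(Order order) {
        double sum = 0;
        for (double price : order.prices()) {
            sum += price;
        }
        return sum;
    }
}

// After: the method lives on the object that owns the data.
class Order {
    private final List<Double> prices = new ArrayList<>();

    void add(double price) { prices.add(price); }

    List<Double> prices() { return prices; }

    double total() {
        double sum = 0;
        for (double price : prices) {
            sum += price;
        }
        return sum;
    }
}

// A subclass can now change the algorithm, which the static version cannot see.
class DiscountedOrder extends Order {
    private final double discountRate;

    DiscountedOrder(double discountRate) { this.discountRate = discountRate; }

    @Override
    double total() {
        return super.total() * (1.0 - discountRate);
    }
}
```

Calling total() on a DiscountedOrder applies the discount, while OrderUtil.total(...) on the same object still returns the undiscounted sum, which is the kind of conditional the static version would otherwise need.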
Limitations: some classes can't be extended, usually because they are outside your control (JDK, libraries, ...). I believe this is the real justification for helpers: when you need to put a method on a class that you can't change.
Our good practice then is to name the helper after the class to extend, with a Helper suffix (StringHelper, DateHelper). This close match between the class where we would like the code to be and the helper lets us find those methods in a few seconds, even without knowing whether someone else on our project has written that method or not.
The Helper suffix is a good convention, since it is used in several languages (at least in Java, and IIRC Rails uses it too).
The intent of your helper should be conveyed by the method name; use the class only as a placeholder. For example, ParseCommaSeparatedIntegersFromString is a bad name for a couple of reasons:
it is too long, really
it is redundant: in a statically typed language you can remove the FromString suffix, since it is deduced from the signature
What do you think about:
CSVHelper.parse(String)
CSVHelper.create(int[])
HTMLHelper.clean(String)
...
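For what it's worth, the first two signatures above could be filled in roughly like this in Java. This is only a sketch: the behaviour on edge cases (blank input gives an empty array, non-numeric tokens throw NumberFormatException) is my assumption, not something the original names specify.

```java
import java.util.StringJoiner;

final class CSVHelper {
    private CSVHelper() {} // static helper, never instantiated

    // Parses a comma-separated list of integers, ignoring spaces around tokens.
    // Assumption: blank input yields an empty array; a non-numeric token
    // makes Integer.parseInt throw NumberFormatException.
    static int[] parse(String csv) {
        String trimmed = csv.trim();
        if (trimmed.isEmpty()) {
            return new int[0];
        }
        String[] parts = trimmed.split(",");
        int[] values = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            values[i] = Integer.parseInt(parts[i].trim());
        }
        return values;
    }

    // Joins the integers with commas, without a trailing separator.
    static String create(int[] values) {
        StringJoiner joiner = new StringJoiner(",");
        for (int value : values) {
            joiner.add(Integer.toString(value));
        }
        return joiner.toString();
    }
}
```

At the call site, CSVHelper.parse(text) and CSVHelper.create(ids) read about as well as the long names, with the types carrying the rest of the meaning.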

Design question: pass the fields you use or pass the object?

I often see two conflicting strategies for method interfaces, loosely summarized as follows:
// Form 1: Pass in an object.
double calculateTaxesOwed(TaxForm f) { ... }
// Form 2: Pass in the fields you'll use.
double calculateTaxesOwed(double taxRate, double income) { ... }
// use of form 1:
TaxForm f = ...
double payment = calculateTaxesOwed(f);
// use of form 2:
TaxForm f = ...
double payment = calculateTaxesOwed(f.getTaxRate(), f.getIncome());
I've seen advocates for the second form, particularly in dynamic languages where it may be harder to evaluate what fields are being used.
However, I much prefer the first form: it's shorter, there is less room for error, and if the definition of the object changes later you won't necessarily need to update method signatures, perhaps just change how you work with the object inside the method.
Is there a compelling general case for either form? Are there clear examples of when you should use the second form over the first? Are there SOLID or other OOP principles I can point to to justify my decision to use one form over the other? Do any of the above answers change if you're using a dynamic language?
In all honesty it depends on the method in question.
If the method makes sense without the object, then the second form is easier to re-use and removes a coupling between the two classes.
If the method relies on the object, then fair enough, pass the object.
There is probably a good argument for a third form, where you pass an interface designed to work with that method. That gives you the clarity of the first form with the flexibility of the second.
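A minimal Java sketch of that third form, under assumptions: the interface name TaxInputs is made up, the TaxForm fields are guessed, and "rate times income" is just a placeholder formula.

```java
// Narrow, hypothetical interface that names exactly what the calculation needs.
interface TaxInputs {
    double getTaxRate();
    double getIncome();
}

// TaxForm comes from the question; its fields here are assumed.
final class TaxForm implements TaxInputs {
    private final String filerName; // extra data the calculation never sees
    private final double taxRate;
    private final double income;

    TaxForm(String filerName, double taxRate, double income) {
        this.filerName = filerName;
        this.taxRate = taxRate;
        this.income = income;
    }

    String getFilerName() { return filerName; }

    @Override
    public double getTaxRate() { return taxRate; }

    @Override
    public double getIncome() { return income; }
}

final class TaxCalculator {
    private TaxCalculator() {}

    // Depends only on the interface, so any source of rate and income works.
    // The formula is a placeholder for the real tax rules.
    static double calculateTaxesOwed(TaxInputs inputs) {
        return inputs.getTaxRate() * inputs.getIncome();
    }
}
```

The call site stays as short as form 1 (calculateTaxesOwed(f)), while a test or another data source can supply a small TaxInputs implementation without building a whole TaxForm.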
It depends on the intention of your method.
If the method is designed to work specifically with that object and only that object, pass the object. It makes for a nice encapsulation.
But, if the method is more general purpose, you will probably want to pass the parameters individually. That way, the method is more likely to be reused when the information is coming from another source (i.e. different types of objects or other derived data).
I strongly recommend the second solution: calculateTaxesOwed() calculates some data, hence needs some numerical input. The method has absolutely nothing to do with the user interface and should therefore not consume a form as input, because you want your business logic separated from your user interface.
The method performing the calculation should (usually) not even belong to the same module as the user interface. If the method takes the form as input, you get a circular dependency, because the user interface requires the business logic and the business logic requires the user interface form. That is a very strong indication that something is wrong (though it could still be solved using interface-based programming).
UPDATE
If the tax form is not a user interface form, things change a bit. In this case I suggest exposing the value through an instance method GetOwedTaxes() or an instance property OwedTaxes of the TaxForm class, but I would not use a static method. If the calculation can be reused elsewhere, one could still create a static helper method consuming the values, not the form, and call this helper method from within the instance method or property.
I don't think it really matters. You open yourself to side effects if you pass in the object, as it might be mutated. This might, however, be what you want. To mitigate this (and to aid testing) you are probably better off passing the interface rather than the concrete type. The benefit is that you don't need to change the method signature if you want to access another field of the object.
Passing all the parameters makes it clearer what the type needs, and might make it easier to test (though if you use the interface this is less of a benefit). But you will have more refactoring.
Judge each situation on its merits and pick the least painful.
Passing just the arguments can be easier to unit test, as you don't need to mock up entire objects full of data just to test functionality that is essentially just static calculation. If there are just two fields being used, of the object's many, I'd lean towards just passing those fields, all else being equal.
That said, when you end up with six, seven or more fields, it's time to consider passing either the whole object or a subset of the fields in a "payload" class (or struct/dictionary, depending on the language's style). Long method signatures are usually confusing.
The other option is to make it a method on the class itself, so you don't have to pass anything. It's less convenient to test, but worth considering when your method is only ever used on a TaxForm object's data.
I realize that this is largely an artifact of the example used and so it may not apply in many real-world cases, but, if the function is tied so strongly to a specific class, then shouldn't it be:
double payment = f.calculateTaxesOwed();
It seems more appropriate to me that a tax document would carry the responsibility itself for calculating the relevant taxes rather than having that responsibility fall onto a utility function, particularly given that different tax forms tend to use different tax tables or calculation methods.
One advantage of the first form is abstraction: programming to an interface rather than an implementation. It makes your code easier to maintain in the long run, because you can change the implementation of TaxForm without affecting the client code as long as the interface of TaxForm does not change.
This is the same as the "Introduce Parameter Object" refactoring from Martin Fowler's book on refactoring. Fowler suggests that you perform this refactoring if there is a group of parameters that tend to be passed together.
If you believe in the Law of Demeter, then you would favor passing exactly what is needed:
http://en.wikipedia.org/wiki/Law_of_Demeter
http://www.c2.com/cgi/wiki?LawOfDemeter
Separation of UI and Data to be manipulated
In your case, you are missing an intermediate class, say, TaxInfo, representing the entity to be taxed. The reason is that UI (the form) and business logic (how tax rate is calculated) are on two different "change tracks", one changes with presentation technology ("the web", "The web 2.0", "WPF", ...), the other changes with legalese. Define a clear interface between them.
General discussion, using an example:
Consider a function to create a bitmap for a business card. Is the purpose of the function
(1) // Formats a business card title from first name and last name
OR
(2) // Formats a business card title from a Person record
The first option is more generic, with weaker coupling, which is generally preferable. However, in many cases it is less robust against change requests, e.g. consider "case 2017: add person's initial to business card".
Changing the implementation (adding person.Initial) is usually easier and faster than changing the interface.
The choice ultimately depends on what type of changes you expect: is it more likely that more information from a Person record will be required, or is it more likely that you will want to create business card titles for data structures other than Person?
If that is undecided, and you can't opt for purpose (1) or (2), I'd rather go with (2), for syntactic cleanliness.
If I were made to choose one of the two, I'd always go with the second one: what if you find that you (for whatever reason) need to calculate the taxes owed, but you don't have an instance of TaxForm?
This is a fairly trivial example, however I've seen cases where a method doing a relatively simple task had complex inputs which were difficult to create, making the method far more difficult to use than it should have been. (The author simply hadn't considered that other people might want to use that method!)
Personally, to make the code more readable, I would probably have both:
double calculateTaxesOwed(TaxForm f)
{
    return calculateTaxesOwed(f.getTaxRate(), f.getIncome());
}
double calculateTaxesOwed(double taxRate, double income) { ... }
My rule of thumb is, wherever possible, to have a method that takes exactly the input it needs; it's very easy to write wrapper methods.
Personally, I'll go with #2 since it makes much clearer what the method needs. Passing the TaxForm (if it is what I think it is, like a Windows Form) is sort of smelly and makes me cringe a little (>_<).
I'd use the first variation only if you are passing a DTO specific to the calculation, like an IncomeTaxCalculationInfo object that contains the TaxRate, the Income, and whatever else is needed to calculate the final result in the method, but never something like a Windows / Web Form.

How do you fight growing parameter list in class hierarchy?

I have a strong feeling that I do not know which pattern or language technique to use in this situation.
So, the question is how to manage a growing parameter list in a class hierarchy in a language with OOP support. If the root class in the hierarchy has, say, 3 or 4 parameters, then its derived class needs to call the base constructor and also pass additional parameters for the derived part of the object, and so forth. Parameter lists become enormous once the inheritance depth exceeds two.
I'm pretty sure that many SO users have faced this problem, and I am interested in ways to solve it. Many thanks in advance.
Constructors with long parameter lists are an indication that your class is trying to do too much. One approach to resolving that problem is to break it apart, and use a "coordinator" class to manage the pieces. Subclasses whose constructor parameter lists differ significantly from their superclass's are another example of a class doing too much. If a subclass truly is-a superclass, then it shouldn't require significantly more data to do its job.
That said, there are occasional cases where a class needs to work on a large number of related objects. In this situation, I would create a new object to hold the related parameters.
Alternatives:
Use setter injection instead of constructor injection
Encapsulate the parameters in a separate container class, and pass that between constructors instead.
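A rough Java sketch of the container alternative, assuming a hypothetical Person/Employee hierarchy and made-up field names: each level of the hierarchy gets a matching data holder, so every constructor takes a single argument no matter how deep the chain goes.

```java
// Hypothetical parameter containers; each level adds only its own fields.
// Public mutable fields keep the sketch short; real code might prefer
// immutable holders.
class PersonData {
    String name;
    int birthYear;
}

class EmployeeData extends PersonData {
    String employeeId;
    String department;
}

class Person {
    private final String name;
    private final int birthYear;

    Person(PersonData data) {
        this.name = data.name;
        this.birthYear = data.birthYear;
    }

    String getName() { return name; }
    int getBirthYear() { return birthYear; }
}

class Employee extends Person {
    private final String employeeId;
    private final String department;

    Employee(EmployeeData data) {
        super(data); // the same container travels up the chain
        this.employeeId = data.employeeId;
        this.department = data.department;
    }

    String getEmployeeId() { return employeeId; }
    String getDepartment() { return department; }
}
```

Adding a field to Person then only touches PersonData and the Person constructor body; no subclass constructor signature changes.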
Don't use constructors to initialize the whole object at once. Only have it initialize those things which (1) are absolutely required for the existence of the object and (2) which must be done immediately at its creation. This will dramatically reduce the number of parameters you have to pass (likely to zero).
For a typical hierarchy like SalariedEmployee >> Employee >> Person you will have getters and setters to retrieve and change the various properties of the object.
Seeing the code would help me suggest a solution.
However, long parameter lists are a code smell, so I'd take a careful look at the design that requires this. The suggested refactorings to counter this are:
Introduce Parameter Object
Preserve Whole Object
However, if you find that you absolutely need this and a long inheritance chain, consider using a hash or property-bag-like object as the sole parameter:
public MyClass(PropertyBag configSettings)
{
    // each class extracts properties it needs and applies them
    m_Setting1 = configSettings["Setting1"];
}
Possibilities:
Perhaps your class(es) are doing too much if they require so much state to be provided up-front? Aim to adhere to the Single Responsibility Principle.
Perhaps some of these parameters should logically exist in a value object of their own that is itself passed in as a parameter?
For classes whose construction really is complex, consider using the builder or factory pattern to instantiate these objects in a readable way - unlike method names, constructor parameters lack the ability to self document.
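As an illustration of the builder suggestion, here is a minimal Java sketch. The Report class, its fields, and the default values are all hypothetical; the point is only that each optional value is named at the call site instead of being a positional constructor argument.

```java
final class Report {
    private final String title;   // required
    private final String author;  // optional
    private final int pageLimit;  // optional
    private final boolean draft;  // optional

    private Report(Builder b) {
        this.title = b.title;
        this.author = b.author;
        this.pageLimit = b.pageLimit;
        this.draft = b.draft;
    }

    // Only the truly required value is a parameter.
    static Builder builder(String title) {
        return new Builder(title);
    }

    String getTitle() { return title; }
    String getAuthor() { return author; }
    int getPageLimit() { return pageLimit; }
    boolean isDraft() { return draft; }

    static final class Builder {
        private final String title;
        // Defaults are assumptions made up for this example.
        private String author = "unknown";
        private int pageLimit = 10;
        private boolean draft = false;

        private Builder(String title) { this.title = title; }

        Builder author(String author) { this.author = author; return this; }
        Builder pageLimit(int pageLimit) { this.pageLimit = pageLimit; return this; }
        Builder draft(boolean draft) { this.draft = draft; return this; }

        Report build() { return new Report(this); }
    }
}
```

A call such as Report.builder("Q3").author("Ann").draft(true).build() documents itself in a way that new Report("Q3", "Ann", 10, true) does not.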
Another tip: Keep your class hierarchy shallow and prefer composition to inheritance. That way your constructor parameter list will remain short.