Explicit API methods vs. generalised parameter-based API methods

When defining a customer-accessible API, what is the preferred industry practice between the following:
a) Defining a set of explicit API methods, each with a very narrow and specific purpose, for example:
SetUserName <name>
SetUserAge <age>
SetUserAddress <address>
b) Defining a set of more generalised parameter-based API methods, for example:
SetUserAttribute <attribute>
enum attribute {
name,
age,
address
}
My opinion:
In favour of (a)
For boolean-based methods (e.g. EnableFoo) I would definitely favour option (a), as the intentions are much clearer, it's less likely to require extensions in the future, and it makes for more readable code.
For example, a method called EnableDisableFoo which takes a boolean parameter indicating whether to enable or disable would not be very clear, nor have a cohesive purpose.
It's where there are multiple options that the problem gets more complicated.
In favour of (b)
Option (b) is a great way of providing extensibility in the API, but at the expense of usability. With option (a), the API method name itself gives enough information to indicate what it is doing. With option (b), the user has to look up both the method name and the appropriate enumeration/parameter to use. In theory this makes option (b) worse from a usability standpoint -- but maybe having fewer methods is a good thing, so even this isn't completely true.
Other thoughts
It's necessary to strike a good balance between usability and extensibility, and they are often at odds with each other. But I'd like to think there is a more objective way to analyse this, rather than relying on the opinion of the API designer.
Does anyone have any thoughts on this?

I would personally argue for (a), since our goal is to make the "static" code as accurate and reliable as possible.
By using the generalized form, we are introducing a risk for runtime errors. For example, I could set an attribute of type age with a value that is actually a string, etc.
This is very similar to the argument for defining and using enums or explicit types rather than using and returning ints in the old C style, as you get one more level of assurance.
While I agree that (b) allows extensibility, I have not seen too many APIs that would require this sort of extensibility for completely different types of attributes. The only common use of (b) is in polymorphic code, where the function could technically accept anything, including extensions.
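To make the type-safety argument concrete, here is a minimal Java sketch. The User class, the UserAttribute enum and every method name are hypothetical, loosely based on the SetUserName / SetUserAttribute examples in the question. It only illustrates that the explicit setter is checked by the compiler, while the general one has to check the value at run time.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical names throughout; this only illustrates the trade-off.
enum UserAttribute { NAME, AGE, ADDRESS }

class User {
    private String name;
    private int age;
    private final Map<UserAttribute, Object> attributes = new EnumMap<>(UserAttribute.class);

    // Option (a): one narrow method per attribute; the compiler checks the argument type.
    void setName(String name) { this.name = name; }
    void setAge(int age) { this.age = age; }

    // Option (b): one general method; the value type can only be checked at run time.
    void setAttribute(UserAttribute attribute, Object value) {
        if (attribute == UserAttribute.AGE && !(value instanceof Integer)) {
            throw new IllegalArgumentException("AGE expects an Integer, got: " + value);
        }
        attributes.put(attribute, value);
    }

    String getName() { return name; }
    int getAge() { return age; }
    Object getAttribute(UserAttribute attribute) { return attributes.get(attribute); }
}

class ExplicitVsGenericDemo {
    public static void main(String[] args) {
        User user = new User();
        user.setAge(42); // user.setAge("forty-two") would be rejected by the compiler
        try {
            user.setAttribute(UserAttribute.AGE, "forty-two"); // compiles, fails at run time
        } catch (IllegalArgumentException e) {
            System.out.println("Run-time failure: " + e.getMessage());
        }
    }
}
```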

Another consideration is whether you want to set all attributes, and whether you want to set them simultaneously. For example, when you want to send something to a printer there may be dozens of parameters to be set (landscape or portrait; number of copies; page size; resolution; etc.). Instead of defining an API which needs to be invoked dozens of times, you can define a single function which takes a struct as a parameter. The struct contains dozens of fields; the caller initializes it at its leisure and then passes it to the API in a single function call.
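A rough Java sketch of that printer idea follows. PrintOptions, Printer.print and the default values are all made up for illustration; a real printing API would look different.

```java
// Hypothetical printer API: all names and default values are assumptions for illustration.
class PrintOptions {
    boolean landscape = false;
    int copies = 1;
    String pageSize = "A4";
    int resolutionDpi = 300;
}

class Printer {
    // One call receives every setting at once instead of dozens of setter calls.
    static String print(String document, PrintOptions options) {
        return String.format("%s: %d copies, %s, %s, %d dpi",
                document,
                options.copies,
                options.landscape ? "landscape" : "portrait",
                options.pageSize,
                options.resolutionDpi);
    }
}

class PrintOptionsDemo {
    public static void main(String[] args) {
        PrintOptions options = new PrintOptions(); // start from the defaults
        options.copies = 2;                        // change only what differs
        options.landscape = true;
        System.out.println(Printer.print("report.pdf", options));
    }
}
```

The caller only touches the fields it cares about, and adding a new field later does not change the signature of print.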

I think it depends on the code that you're writing. If you're dealing with attributes that always go together (e.g., if you always use or change age together with name), then go for (b); otherwise (a) is fine.
But don't try to over-do (a) because then you're just going to write a lot more lines and get a lot less done. Good idea if you're paid for the amount of code you write though :)

Related

What's the purpose of using method overloading?

I want to know the exact reason why method overloading is used in OOP instead of giving a different method name to every variation, as I was asked this at an interview. Please help me to understand this concept.
Without using any fancy terms: say you're building an API, and there's a method called crush which crushes or destroys whatever parameter is given to it. If you give every variation its own name, you'll need at least three different methods, one each for int, float and char (I'm using the most general types as an example). The more types there are, the more methods with different names you'll have to create. The developer using your API then has to remember many different names for something as simple as a method that destroys its parameter. Besides being harder to use, the code is also much less readable, because there are too many names for a single function (function as in job).
Method overloading isn't meant for everything. It's intended for methods or functions that might take different types of data at different points, but internally follow the same procedure or do a single thing no matter what type of data is passed.
You won't be writing one version of print that takes an int as a parameter, and returns the modulus of that, and another version of print that takes a string as an argument, and prints that to stdout. You can, but that's not how it's meant to be used.
It is mainly so as to be able to follow a relatively well-known software design principle called "Syntactic Consistency" from the book "Principles of Programming Languages" by Bruce J. MacLennan, which says the following:
Similar things should look similar, whereas
different things should look different.
When you see two functions with different names, you might be tempted to believe that they do different things. If they do in fact do different things, it is okay, but what if they do the same thing? In that case, it would be nice if the functions have the exact same name, so as to indicate that they do, in fact, do the same thing.
Of course you can misuse overloading. If you go around writing functions that do different things, taking advantage of overloading to give them the same name, then you are shooting yourself in the foot.
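As a small illustration of the crush example from the first answer, here is a Java sketch. The Crusher class and the returned strings are invented for the example; the point is only that one name covers the same job for several argument types.

```java
// Hypothetical Crusher class: one name for one job, regardless of argument type.
class Crusher {
    // The compiler picks the overload from the static type of the argument.
    static String crush(int value)    { return "crushed int " + value; }
    static String crush(double value) { return "crushed double " + value; }
    static String crush(char value)   { return "crushed char " + value; }
}

class OverloadingDemo {
    public static void main(String[] args) {
        // The caller remembers a single name instead of crushInt, crushDouble, crushChar.
        System.out.println(Crusher.crush(7));
        System.out.println(Crusher.crush(2.5));
        System.out.println(Crusher.crush('x'));
    }
}
```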

Multiple Dispatch: A conceptual necessity?

I wonder if the concept of multiple dispatch (that is, built-in support, as if the dynamic dispatch of virtual methods is extended to the method's arguments as well) should be included in an object-oriented language if its impact on performance would be negligible.
Problem
Consider the following scenario: I have a -- not necessarily flat -- class hierarchy containing types of animals. At different locations in my code, I want to perform some actions on an animal object. I do not care, nor can I control, how this object reference is obtained. I might encounter it by traversing a list of animals, or it might be given to me as one of a method's arguments. The action I want to perform should be specialized depending on the runtime type of the given animal. Examples of such actions would be:
Construct a view-model for the animal in order to present it in the GUI.
Construct a data object (to later store into the DB) representing this type of animal.
Feed the animal with some food, but give different kinds of food depending on the type of the animal (what is more healthy for it)
All of these examples operate on the public API of an animal object, but what they do is not the animal's own business, and therefore cannot be put into the animal itself.
Solutions
One "solution" would be to perform type checks. But this approach is error-prone and uses reflective features, which (in my opinion) is almost always an indication of bad design. Types should be a compile-time concept only.
Another solution would be to "abuse" (sort of) the visitor pattern to mimic double dispatch. But this would require that I change my animals to accept a visitor.
I am sure there are other approaches. Also, the problem of extension should be addressed: If new types of animals join the party, how many code locations need to be adapted, and how can I find them reliably?
The Question
So, in the light of these requirements, shouldn't multiple dispatch be an integral part of any well-designed object-oriented language?
Isn't it natural to make external (not just internal) actions dependent on the dynamic type of a given object?
Best regards!
You are suggesting dynamic dispatching based on method name / signature combined with runtime actual argument types. I think you're crazy.
So, in the light of these requirements, shouldn't multiple dispatch be an integral part of any well-designed object-oriented language?
That there are problems for which the availability of the kind of dispatch strategy you envision would simplify coding is a weak argument for such dispatch being built into any given language, much less every OO language.
Isn't it natural to make external (not just internal) actions dependent on the dynamic type of a given object?
Perhaps, but not everything that seems "natural" is in fact a good idea. Clothes are not natural, for instance, but see what happens if you try going around in public without (somewhere other than Berkeley, anyway).
Some languages already have static dispatch based on argument types, more conventionally called "overloading". Dynamic dispatch based on argument types, on the other hand, is a real mess if there is more than one argument to be considered, and it cannot help but be slow(er). Today's popular OO languages provide for you to perform double dispatch where it is wanted, without the overhead of supporting it in the vast majority of places where you don't want it.
Furthermore, although implementing double-dispatch does present maintenance issues arising from tight coupling between separate components, there are coding strategies that can help keep that manageable. It is anyway unclear to what extent having argument-based multiple dispatch built in to a given language would actually help with that problem.
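For readers who have not seen it, this is roughly what hand-written double dispatch via the visitor pattern looks like in Java. The animal types and the FoodChooser visitor are hypothetical, taken from the feeding example in the question. It is a sketch of the general technique, not a recommendation for any particular code base.

```java
class VisitorSketch {
    // One method per concrete animal type; the animal picks the right one.
    interface AnimalVisitor<R> {
        R visitDog(Dog dog);
        R visitCat(Cat cat);
    }

    interface Animal {
        <R> R accept(AnimalVisitor<R> visitor);
    }

    static class Dog implements Animal {
        @Override
        public <R> R accept(AnimalVisitor<R> visitor) { return visitor.visitDog(this); }
    }

    static class Cat implements Animal {
        @Override
        public <R> R accept(AnimalVisitor<R> visitor) { return visitor.visitCat(this); }
    }

    // An "external" action: it lives outside the animal classes.
    static class FoodChooser implements AnimalVisitor<String> {
        @Override public String visitDog(Dog dog) { return "kibble"; }
        @Override public String visitCat(Cat cat) { return "fish"; }
    }

    public static void main(String[] args) {
        Animal[] animals = { new Dog(), new Cat() };
        FoodChooser chooser = new FoodChooser();
        for (Animal animal : animals) {
            // First dispatch: on the animal's runtime type. Second: on the visitor's.
            System.out.println(animal.accept(chooser));
        }
    }
}
```

Adding a new animal type forces a new method on AnimalVisitor, so the compiler points at every visitor that has to be updated. That is the coupling cost mentioned above, and also the way to find the affected code reliably.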
One "solution" would be to perform type checks. But this approach is
error-prone and uses reflective features, which (in my opinion) is
almost always an indication of bad design. Types should be a
compile-time concept only.
You're wrong. All uses of virtual functions, virtual inheritance, and such things involve reflective features and dynamic types. The ability to defer typing until runtime when you need to is absolutely critical and is inherent in even the most basic formulation of the situation you're in, which literally cannot even arise without the use of dynamic types. You even describe your problem as wanting to do different things depending on.. the dynamic type. After all, if there is no dynamic typing, why would you need to do things differently? You already know the concrete final type.
Of course, a bit of run-time typing can handle the problem you got yourself into with run-time typing.
Simply build a dictionary/hash table from type to function. You can add entries to this structure dynamically for any dynamically linked derived types, it's a nice O(1) to look up into, and requires no internal support.
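A minimal Java sketch of such a table, assuming hypothetical Animal, Dog and Cat types and a feeding action as in the question. Note that this version looks up the exact runtime class only; falling back to superclasses would need extra code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class TypeDispatchSketch {
    interface Animal {}
    static class Dog implements Animal {}
    static class Cat implements Animal {}

    // Handlers keyed by concrete runtime class; entries can be added at any time.
    private final Map<Class<? extends Animal>, Function<Animal, String>> feeders = new HashMap<>();

    <T extends Animal> void register(Class<T> type, Function<? super T, String> feeder) {
        // The cast is safe because feed() only looks up handlers by the exact class.
        feeders.put(type, animal -> feeder.apply(type.cast(animal)));
    }

    String feed(Animal animal) {
        Function<Animal, String> feeder = feeders.get(animal.getClass());
        if (feeder == null) {
            throw new IllegalArgumentException(
                    "No handler for " + animal.getClass().getSimpleName());
        }
        return feeder.apply(animal);
    }

    public static void main(String[] args) {
        TypeDispatchSketch dispatch = new TypeDispatchSketch();
        dispatch.register(Dog.class, dog -> "kibble");
        dispatch.register(Cat.class, cat -> "fish");
        System.out.println(dispatch.feed(new Dog()));
        System.out.println(dispatch.feed(new Cat()));
    }
}
```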
If one restricts oneself to the situation where knowledge of how an object of type X should fnorble an object of type Y must be stored in either class X or class Y, one can have the base type of Y include a method that accepts a reference of X's base type and indicates how much an object knows about how to be fnorbled by the object identified by that reference, as well as a method that asks the Y to have an X fnorble it.
Having done that, one can have X's Fnorble(Y) method start by asking the Y how much it knows about being fnorbled by a particular type of X. If the Y knows more about X than X knows about Y, then X's Fnorble(Y) method should call the Y's BeFnorbledBy(X) method; otherwise, the X should fnorble the Y as best it knows how.
Depending upon how many different kinds of X and Y there are, Y could define BeFnorbledBy overloads for different kinds of X, such that when X calls target.BeFnorbledBy(this) it would automatically dispatch directly to a suitable method; such an approach, however, would require every Y to know about every type of X that was "interesting" to anybody, whether or not it had any interest in that particular type itself.
Note that this approach doesn't accommodate the situation where there might be an outside object of class Z which knows things about how an X should fnorble a Y that neither X nor Y knows directly. That kind of situation is best handled by having a "rulebook" object where everything that knows about how various kinds of X should fnorble various kinds of Y can tell the rulebook, and code which wants an X to fnorble a Y can ask the rulebook to make that happen. Although languages could provide assistance in cases where rulebooks are singletons, there may be times when it may be useful to have multiple rulebooks. The semantics in those cases are probably best handled by having code use rulebooks directly.
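One possible shape for such a rulebook, sketched in Java (it uses List.of, so it assumes Java 9 or later). Robot, Plant and the rule text are placeholders, and the lookup matches exact class pairs only; a real rulebook would probably need a fallback strategy for subclasses.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

class RulebookSketch {
    // Hypothetical participant types; neither knows anything about the other.
    static class Robot {}
    static class Plant {}

    static class Rulebook {
        // Rules keyed by the exact (actor class, target class) pair.
        private final Map<List<Class<?>>, BiFunction<Object, Object, String>> rules = new HashMap<>();

        <X, Y> void addRule(Class<X> actorType, Class<Y> targetType,
                            BiFunction<? super X, ? super Y, String> rule) {
            rules.put(List.of(actorType, targetType),
                    (actor, target) -> rule.apply(actorType.cast(actor), targetType.cast(target)));
        }

        String fnorble(Object actor, Object target) {
            BiFunction<Object, Object, String> rule =
                    rules.get(List.of(actor.getClass(), target.getClass()));
            if (rule == null) {
                throw new IllegalArgumentException("No rule for this pair of types");
            }
            return rule.apply(actor, target);
        }
    }

    public static void main(String[] args) {
        // Some third party (the "Z" above) teaches the rulebook what to do.
        Rulebook rulebook = new Rulebook();
        rulebook.addRule(Robot.class, Plant.class, (robot, plant) -> "robot waters plant");
        System.out.println(rulebook.fnorble(new Robot(), new Plant()));
    }
}
```

Because the rulebook is an ordinary object, code can hold several of them with different rules, which is the multiple-rulebook case described above.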

Downsides about using interface as parameter and return type in OOP

This is a question independent from languages.
Conceptually, it's good to code to interfaces (contracts) instead of specific implementations. I have no problem understanding the merits of the practice.
However, when I actually follow that practice, the users of my classes from time to time need to cast the interfaces to specific classes, in order to reach functions that only those implementing classes provide.
I understand there must be something wrong, either on my side or on the user's side, as the interface should expose all methods/properties(in the case of c#) that can possibly be necessary.
The code base is huge, and the users are clients.
It won't be particularly easy to make changes on either side.
That makes me wonder about the downsides of using interfaces as parameter and return types.
Can people please list demerits of the practice? And please, include any solution if you know how to work around it.
Thanks a lot for enlightening me.
EDIT:
To be a bit more specific:
Assume we have a class called DbInfoExtractor. It has a public method GetInfo, as follows:
public IInformation GetInfo(IInfoParam);
where IInformation is an interface implemented by specific classes like VideoInfo, AudioInfo, TextInfo, etc., and IInfoParam is an interface implemented by specific classes like VideoInfoParam, AudioInfoParam, TextInfoParam, etc.
Apparently, depending on the specific object passed into the method GetInfo, the DbInfoExtractor needs to take different actions, as it is reasonable to assume that for different types of information, the extractor considers different sets of aspects(e.g. {size, title, date} for video, {title, author} for text information, etc) as search keys and search for relevant information in different ways.
Here, I see two options to go on:
1, using if ... else ... to decide what action to take depending on the type of the parameter the GetInfo method receives. This is certainly bad, as avoiding this situation is one of the very reasons we use polymorphism.
2, We should call IInfoParam.TakeAction(), and each specific implementation of IInfoParam has its own TakeAction() method to actually search and find the corresponding information from the database.
This option seems better, but still quite bad, as it shouldn't be the parameter that takes action searching and finding the information; it should be the responsibility of DbInfoExtractor.
So how can I delegate the TakeAction back to DbInfoExtractor? (I actually wrote some code to do this, but it's neither standard nor elegant. Basically I make parameter classes nested classes in DbInfoExtractor, so that they can call various versions of TakeAction of DbInfoExtractor.)
Please enlighten me!
Thanks.
Why not
public IVideoInformation GetVideoInformation(VideoQuery);
public IAudioInformation GetAudioInformation(AudioQuery);
// etc.
It doesn't look like there's a need for polymorphism here.
The query types are Query Objects, if you need those. They probably don't need to be interfaces; they know nothing about the database. A simple list of parameters (maybe just ID) might be sufficient.
The question is what does the client have, and what do they want? That's your interface.
Switch statements and casting are a smell, and typically mean that you've violated the Liskov substitution principle.

Design question: pass the fields you use or pass the object?

I often see two conflicting strategies for method interfaces, loosely summarized as follows:
// Form 1: Pass in an object.
double calculateTaxesOwed(TaxForm f) { ... }
// Form 2: Pass in the fields you'll use.
double calculateTaxesOwed(double taxRate, double income) { ... }
// use of form 1:
TaxForm f = ...
double payment = calculateTaxesOwed(f);
// use of form 2:
TaxForm f = ...
double payment = calculateTaxesOwed(f.getTaxRate(), f.getIncome());
I've seen advocates for the second form, particularly in dynamic languages where it may be harder to evaluate what fields are being used.
However, I much prefer the first form: it's shorter, there is less room for error, and if the definition of the object changes later you won't necessarily need to update method signatures, perhaps just change how you work with the object inside the method.
Is there a compelling general case for either form? Are there clear examples of when you should use the second form over the first? Are there SOLID or other OOP principles I can point to to justify my decision to use one form over the other? Do any of the above answers change if you're using a dynamic language?
In all honesty it depends on the method in question.
If the method makes sense without the object, then the second form is easier to re-use and removes a coupling between the two classes.
If the method relies on the object then fair enough pass the object.
There is probably a good argument for a third form where you pass an interface designed to work with that method. Gives you the clarity of the first form with the flexibility of the second.
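A sketch of that third form in Java, assuming a hypothetical Taxable interface and a made-up formula (rate times income) purely for illustration. The method depends only on the narrow interface, so anything that can supply a rate and an income can be passed in.

```java
// Narrow, hypothetical interface: exposes only what the calculation needs.
interface Taxable {
    double getTaxRate();
    double getIncome();
}

class TaxForm implements Taxable {
    private final double taxRate;
    private final double income;

    TaxForm(double taxRate, double income) {
        this.taxRate = taxRate;
        this.income = income;
    }

    @Override public double getTaxRate() { return taxRate; }
    @Override public double getIncome() { return income; }
}

class TaxCalculator {
    // Assumed formula (rate times income); the real calculation is not given in the question.
    static double calculateTaxesOwed(Taxable source) {
        return source.getTaxRate() * source.getIncome();
    }
}

class TaxableDemo {
    public static void main(String[] args) {
        Taxable form = new TaxForm(0.25, 1000.0);
        System.out.println(TaxCalculator.calculateTaxesOwed(form));
    }
}
```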
It depends on the intention of your method.
If the method is designed to work specifically with that object and only that object, pass the object. It makes for a nice encapsulation.
But, if the method is more general purpose, you will probably want to pass the parameters individually. That way, the method is more likely to be reused when the information is coming from another source (i.e. different types of objects or other derived data).
I strongly recommend the second solution - calculateTaxesOwed() calculates some data, hence needs some numerical input. The method has absolutely nothing to do with the user interface and should in turn not consume a form as input, because you want your business logic separated from your user interface.
The method performing the calculation should (usually) not even belong to the same module as the user interface. In this case you get a circular dependency because the user interface requires the business logic and the business logic requires the user interface form - a very strong indication that something is wrong (but it could still be solved using interface-based programming).
UPDATE
If the tax form is not a user interface form, things change a bit. In this case I suggest exposing the value using an instance method GetOwedTaxes() or instance property OwedTaxes of the TaxForm class, but I would not use a static method. If the calculation can be reused elsewhere, one could still create a static helper method consuming the values, not the form, and call this helper method from within the instance method or property.
I don't think it really matters. You open yourself to side effects if you pass in the Object as it might be mutated. This might however be what you want. To mitigate this (and to aid testing) you are probably better passing the interface rather than the concrete type. The benefit is that you don't need to change the method signature if you want to access another field of the Object.
Passing all the parameters makes it clearer what the type needs, and might make it easier to test (though if you use the interface this is less of a benefit). But you will have more refactoring.
Judge each situation on its merits and pick the least painful.
Passing just the arguments can be easier to unit test, as you don't need to mock up entire objects full of data just to test functionality that is essentially just static calculation. If there are just two fields being used, of the object's many, I'd lean towards just passing those fields, all else being equal.
That said, when you end up with six, seven or more fields, it's time to consider passing either the whole object or a subset of the fields in a "payload" class (or struct/dictionary, depending on the language's style). Long method signatures are usually confusing.
The other option is to make it a class method, so you don't have to pass anything. It's less convenient to test, but worth considering when your method is only ever used on a TaxForm object's data.
I realize that this is largely an artifact of the example used and so it may not apply in many real-world cases, but, if the function is tied so strongly to a specific class, then shouldn't it be:
double payment = f.calculateTaxesOwed();
It seems more appropriate to me that a tax document would carry the responsibility itself for calculating the relevant taxes rather than having that responsibility fall onto a utility function, particularly given that different tax forms tend to use different tax tables or calculation methods.
One advantage of the first form is
Abstraction - programming to an interface rather than an implementation. It makes the maintenance of your code easier in the long run because you may change the implementation of TaxForm without affecting the client code, as long as the interface of TaxForm does not change.
This is the same as the "Introduce Parameter Object" from Martin Fowler's book on refactoring. Fowler suggests that you perform this refactoring if there are a group of parameters that tend to be passed together.
If you believe in the Law of Demeter, then you would favor passing exactly what is needed:
http://en.wikipedia.org/wiki/Law_of_Demeter
http://www.c2.com/cgi/wiki?LawOfDemeter
Separation of UI and Data to be manipulated
In your case, you are missing an intermediate class, say, TaxInfo, representing the entity to be taxed. The reason is that UI (the form) and business logic (how tax rate is calculated) are on two different "change tracks", one changes with presentation technology ("the web", "The web 2.0", "WPF", ...), the other changes with legalese. Define a clear interface between them.
General discussion, using an example:
Consider a function to create a bitmap for a business card. Is the purpose of the function
(1) // Formats a business card title from first name and last name
OR
(2) // Formats a business card title from a Person record
The first option is more generic, with a weaker coupling, which is generally preferable. However, in many cases it is less robust against change requests - e.g. consider "case 2017: add person's initial to business card".
Changing the implementation (adding person.Initial) is usually easier and faster than changing the interface.
The choice ultimately depends on what type of changes you expect: is it more likely that more information from a Person record is required, or is it more likely that you want to create business card titles for data structures other than Person?
If that is "undecided", and you can't opt for purpose (1) or (2), I'd rather go with (2), for syntactic cleanliness.
If I was made to choose one of the two, I'd always go with the second one - what if you find that you (for whatever reason) need to calculate the taxes owed, but you don't have an instance of TaxForm?
This is a fairly trivial example, however I've seen cases where a method doing a relatively simple task had complex inputs which were difficult to create, making the method far more difficult to use than it should have been. (The author simply hadn't considered that other people might want to use that method!)
Personally, to make the code more readable, I would probably have both:
double calculateTaxesOwed(TaxForm f)
{
return calculateTaxesOwed(f.getTaxRate(), f.getIncome());
}
double calculateTaxesOwed(double taxRate, double income) { ... }
My rule of thumb is, wherever possible, to have a method that takes exactly the input it needs - it's very easy to write wrapper methods.
Personally, I'll go with #2 since it's much clearer what the method needs. Passing the TaxForm (if it is what I think it is, like a Windows Form) is sort of smelly and makes me cringe a little (>_<).
I'd use the first variation only if you are passing a DTO specific to the calculation, like an IncomeTaxCalculationInfo object which contains the TaxRate and Income and whatever else is needed to calculate the final result in the method, but never something like a Windows / Web Form.

methods: multiple parameters or structure?

I noticed by looking at sample code from Apple, that they tend to design methods that receive structures instead of multiple parameters. Why is that? As far as ease of use, I personally prefer the latter, but as far as performance goes, is there one better choice than the other?
[pencil drawPoint:Point3Make(20,40,60)]
[pencil drawPointAtX:20 Y:50 Z:60]
Don't muddle this question with concerns of performance. Don't make premature optimizations (until you know you have a problem), and when thinking about performance hot spots in your code, remember that they're almost always in areas dealing with I/O (e.g., database, files). So, separate your question about message passing style from performance. You want to make the best design decision first, then optimize for performance only if needed.
With that being said, Apple does not recommend or prefer passing multiple parameters vs a structure/object. Generalizing this outside of the scope of Objective-C, use individual parameters or objects when it makes sense in the particular scenario. In other words, there isn't a black and white answer that you can follow. Instead, use the following guidelines when deciding:
Pass objects/structures when it makes sense for the method to understand many/all members of the object
Pass objects/structures when you want to validate some rules on the relationship between the various members of the object. This allows you to ensure the consumer of your method constructs a valid object prior to calling your method (thus eliminating the need of the method to validate these conditions).
Pass individual arguments when it is clear the method makes sense and only needs certain elements rather than the entire object
Using a variation on your example, a paint method that takes two coordinates (X and Y) would benefit from taking a Point object rather than two variables, X and Y.
A method retrieveOrderByIdAndName would best be designed by taking the single id and name parameter rather than some container object.
Now, if there were some method to retrieve orders by many different criteria, it would make more sense to create a retrieveOrderByCriteria and pass it some criteria structure.
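To contrast the two retrieval styles, here is a small Java sketch. Order, OrderCriteria and OrderRepository are hypothetical, and the in-memory list only stands in for whatever storage is really used.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical order model and repository; all names are assumptions.
class Order {
    final int id;
    final String customer;
    final double total;

    Order(int id, String customer, double total) {
        this.id = id;
        this.customer = customer;
        this.total = total;
    }
}

class OrderCriteria {
    String customer; // null means "any customer"
    Double minTotal; // null means "no lower bound"
}

class OrderRepository {
    private final List<Order> orders = new ArrayList<>();

    void add(Order order) { orders.add(order); }

    // Two fixed keys: individual parameters are the clearest signature.
    Order retrieveOrderByIdAndName(int id, String customer) {
        for (Order order : orders) {
            if (order.id == id && order.customer.equals(customer)) {
                return order;
            }
        }
        return null; // not found
    }

    // Many optional keys: one criteria structure keeps the signature stable.
    List<Order> retrieveOrderByCriteria(OrderCriteria criteria) {
        List<Order> matches = new ArrayList<>();
        for (Order order : orders) {
            boolean customerOk = criteria.customer == null
                    || criteria.customer.equals(order.customer);
            boolean totalOk = criteria.minTotal == null
                    || order.total >= criteria.minTotal;
            if (customerOk && totalOk) {
                matches.add(order);
            }
        }
        return matches;
    }
}

class OrderCriteriaDemo {
    public static void main(String[] args) {
        OrderRepository repository = new OrderRepository();
        repository.add(new Order(1, "alice", 20.0));
        repository.add(new Order(2, "bob", 75.0));

        OrderCriteria criteria = new OrderCriteria();
        criteria.minTotal = 50.0;
        System.out.println(repository.retrieveOrderByCriteria(criteria).size());
    }
}
```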
If you are passing the same set of parameters around it is useful to pass them in a structure because they belong together semantically.
The performance hit is probably negligible for such a simple structure as a point with three coordinates. Use the readable/reusable solution and then profile your code if you think it is slow :)