Currently we are writing our bachelor thesis about the implementation of a Compiler for an academic object-oriented mini programming language.
We want to be precise in our documentation, and we're currently discussing if a constructor is a routine.
What suggests to us that a constructor is a routine is that it has a block of commands, parameters and local variables. Apart from the missing name, it has all the attributes of other routines.
What suggests that a constructor is not a routine is that it can only be called once per instance.
We are not sure if this question has a clear answer, or if the definition is different from theory to theory.
We would be happy if someone could give a pointer to some literature about this semantic question.
Best
Edit: Some Information about how we name specific things in our Language:
We have functions and procedures. Functions do have a return value, procedures don't.
A constructor is like an unnamed procedure (without an explicit return value).
A constructor is called implicitly, Java-like: x := new X(1, new Y())
Parameters are declared in the definition of a constructor. The instance itself (this) is not considered a parameter but is provided implicitly.
Thanks for your answers so far, they're helping with the thought process.
This depends on language - and for this academic language - I would not say that a constructor is a routine. I say that because in not saying that it is a routine, a separation is kept: unless the language explicitly unifies routines/functions/constructors, don't say it does :)
Now, consider these counter-examples (and there are many more, I am sure):
Languages like Eiffel allow giving constructors different names (which I think is awesome and wish was used more).
Languages like Ruby don't have a "new" operator and invoking a constructor appears as invoking any (class) method. Ruby doesn't even have a way of signaling that a method acts as a constructor (or factory method, as it were).
Constructors in languages like JavaScript are just functions which can be run in a special context when used with new.
Also, at some level it may be viewed that there needs to be no difference in calling a constructor multiple times (you get back a new object - so what?) than calling a function multiple times (where one might get back the same value). Consider that the new object may be immutable and may have value equality with other objects.
That is, considering the following code, is there a constructor used?
5 4 vec2 "1" int 2 vec2 add puts
I made it up, but I hope it makes a point. There may or may not be a constructor or an external difference between a constructor and an ordinary function depending upon how the specific language views the role (or even need) of constructors.
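To make the immutability point a bit more concrete, here is a minimal sketch in Python (my choice of language, not the asker's; `Vec2` is a made-up name echoing the snippet above). With an immutable value type, two constructor calls with the same arguments are indistinguishable by value, much like two calls to an ordinary function:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vec2:
    """Hypothetical immutable 2D vector with value equality."""
    x: int
    y: int

    def add(self, other: "Vec2") -> "Vec2":
        # Returns a new value instead of mutating either operand.
        return Vec2(self.x + other.x, self.y + other.y)


a = Vec2(5, 4)
b = Vec2(5, 4)                      # a second "constructor call", same arguments
total = a.add(Vec2(int("1"), 2))    # mirrors the made-up stack snippet
```

Here `a == b` holds even though two separate objects were created, so from the outside nothing forces a reader to treat `Vec2(...)` differently from any other function that returns a value.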
Now, write the language specification as deemed fit and try to avoid leaking implementation details.
Constructor is a constructor.
It may be like a function (that returns a value: the new object) or a procedure (a routine, a function with no return value, called on an uninitialized object); it may be callable once or many times on an object (although it is arguable whether the object has the same identity afterwards); it may have a name or not, or the name may be forced to match the class, etc. The constructor may even "not exist" or be implicitly created by the compiler from various scattered initializers and code blocks, which otherwise would be expressions/routines/whatchamacallit.
It all depends on the language that you compile and on what you mean by 'function', 'routine', or even 'parameters' (i.e. is 'this' a parameter?).
If you want to ask about such a thing, first describe/define your language and all the terms that you want to use (what is a class? method? function? routine? parameter? constructor? ...) and then, well, most probably you will automatically deduce the answer matching your ontology.
A constructor is a function with special semantics (such that it is called in specific context - as part of object construction), but it is a function anyway - it can have parameters, it has usual flow of control, it can have local variables, etc. It is not better or worse than any other function. I'd say it is a routine.
From outside, a constructor can be seen as a class method, with an instance of that class as return value. Insofar, the claim that "it can only be used once per instance" does not hold water, since there is no instance yet when the constructor is used.
From inside, some special keywordish name like "this" is bound to the uninitialized instance.
Usually, there is some syntactic sugar, like a new keyword. Also, the compiler may help to make sure the instance is properly initialized.
It is special insofar as the functionality of creating a new object is nowhere else provided. But as far as its usage is concerned, a constructor is not (or at least should not be) different from any other class method that happens to return an instance of the class.
BTW, is "routine" an established term in OOP?
I think that a routine is something that can be called explicitly by the caller, as and when required, on a constructed object or class, while a constructor is a special type of routine that is called at runtime when an instance of the class is requested.
A constructor helps only in constructing and initializing the class object and its variables.
It may or may not accept parameters, and it can be overloaded with different sets of parameters.
If the constructor has no parameters and no code inside its code block, you may want to omit it.
Some languages (like C#) automatically create a default parameterless constructor if you do not provide your own.
A constructor can have an access modifier to restrict the creation scope of the class.
A constructor cannot have a return type because it's constructing the same class in which it is declared, and obviously there is no point returning the same type (maybe that's the reason some languages use the same name for the constructor as the class name).
All the implementation rules for a constructor differ from language to language
Furthermore, the most important requirement of a well written constructor is that after it is executed it should leave the class object in a valid state
A constructor (as the name suggests) is only executed when you create a new instance of that class.
The general idea is this: you put the set of operations that should be executed during startup into the constructor. This implies that you cannot call a constructor just like the other methods of your class.
The book I am reading just introduced the concepts of subsumption, covariance, contravariance and its implications for the design of programming languages. Now, in the section about method specialization, I'm getting into trouble.
The book points out that, when we override methods in subclasses, parameter types may be generalized, and result types may be specialized. However, the book argues, that for methods that aren't overridden but inherited, another type of method specialization is at play where parameter types are specialized, and result types are generalized, because we can interpret the self keyword as an argument:
"There is another form of method specialization that happens implicitly by inheritance. The occurrences of self in the methods of C can be considered of type C, in the sense that all objects bound to self have type C or a subtype of C. When the methods of C are inherited by C', the same occurrences of self can similarly be considered of type C'. Thus the type of self is silently specialized on inheritance (covariantly!)."
And here is my problem: Can't we consider the self keyword a covariant argument in overridden methods as well? Then we would end up with the this keyword as a covariant argument, even though we just established that, as a consequence of the substitution principle, arguments of overridden methods need to be contravariant. Am I overlooking something?
Thanks for your help!
Implementation of variance in programming languages
...when we override methods in subclasses, parameter types may be generalized (contravariance), and result types may be specialized (covariance)
Even though this can be true, it depends on the specific language, whether it actually implements this functionality. I have found examples on wiki:
C++ and Java implement only covariant return types; method argument types are invariant.
C# historically implemented neither kind of variance for overrides (both were invariant); covariant return types were added in C# 9.0.
There is an example of a language (Sather) with contravariant argument type and covariant return type - this is what you mentioned.
However, there is also one (Eiffel) with covariant return and argument types, but this can cause runtime errors.
Controlling and "left-over" arguments
There is also a nice paragraph dividing arguments into controlling arguments and "left-over" arguments. The controlling ones are covariant and the non-controlling ones are contravariant. This concerns multiple-dispatch languages, although you were most certainly referring to a single-dispatch language. But even there, there is a single controlling argument (self/this).
Here is the paragraph (I did not have time to study the paper it is referring to, please feel free to read it if you have the time and post your findings):
Giuseppe Castagna[3] observed that in a typed language with multiple dispatch, a generic function can have some arguments which control dispatch and some "left-over" arguments which do not. Because the method selection rule chooses the most specific applicable method, if a method overrides another method, then the overriding method will have more specific types for the controlling arguments. On the other hand, to ensure type safety the language still must require the left-over arguments to be at least as general. Using the previous terminology, types used for runtime method selection are covariant while types not used for runtime method selection of the method are contravariant.
Conventional single-dispatch languages like Java also obey this rule: there only one argument is used for method selection (the receiver object, passed along to a method as the hidden argument this), and indeed the type of this is more specialized inside overriding methods than in the superclass.
The problem
According to that paragraph, I assume that the self argument is by nature not a regular method argument (which may be contravariant), because self is another kind of argument, a controlling argument, and controlling arguments are covariant.
...even though we just established that, as a consequence of the substitution principle, arguments of overridden methods need to be contravariant.
Well, it looks like not all of them.
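A small sketch of that distinction, in Python with type hints. The class names are made up for illustration, and Python does not enforce these annotations at run time (a static type checker would be needed for that). The receiver `self` drives dispatch and becomes more specific in the subclass, while the ordinary parameter may only become more general:

```python
class Animal:
    pass


class Dog(Animal):
    pass


class Kennel:
    def admit(self, guest: Dog) -> str:
        # Inside Kennel, self is a Kennel (or a subtype of it).
        return f"{type(self).__name__} admits {type(guest).__name__}"

    def describe(self) -> str:
        # Inherited unchanged by Shelter; self is silently specialized there.
        return f"I am a {type(self).__name__}"


class Shelter(Kennel):
    def admit(self, guest: Animal) -> str:
        # Ordinary parameter generalized (contravariant): Dog -> Animal.
        # The receiver is specialized (covariant): self is a Shelter here.
        return f"{type(self).__name__} admits any {type(guest).__name__}"


def check_in(place: Kennel, guest: Dog) -> str:
    # Callers only know the Kennel contract; substitution stays safe
    # because Shelter.admit accepts at least every Dog.
    return place.admit(guest)
```

Calling `check_in(Shelter(), Dog())` dispatches on the receiver and runs the overriding method, which is the only way `Shelter.admit` can be reached, so its `self` can safely be assumed to be a `Shelter`. No such guarantee exists for `guest`, which is why that parameter cannot be narrowed.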
I have a class which represents a set of numbers. The constructor takes three arguments: startValue, endValue and stepSize.
The class is responsible for holding a list containing all values between start and end value taking the stepSize into consideration.
Example: startValue: 3, endValue: 1, stepSize = -1, Collection = { 3,2,1 }
I am currently creating the collection and some info strings about the object in the constructor. The public members are read only info strings and the collection.
My constructor does three things at the moment:
Checks the arguments; this could throw an exception from the constructor
Fills values into the collection
Generates the information strings
I can see that my constructor does real work, but how can I fix this, or should I fix this? If I move the "methods" out of the constructor, it is like having an init function, which leaves me with a not fully initialized object. Is the existence of my object doubtful? Or is it not that bad to have some work done in the constructor, since it is still possible to test the constructor because no object references are created?
For me it looks wrong but it seems that I just can't find a solution. I also have taken a builder into account but I am not sure if that's right because you can't choose between different types of creations. However single unit tests would have less responsibility.
I am writing my code in C# but I would prefer a general solution, that's why the text contains no code.
EDIT: Thanks for editing my poor text (: I changed the title back because it represents my opinion and the edited title did not. I am not asking if real work is a flaw or not. For me, it is. Take a look at this reference.
http://misko.hevery.com/code-reviewers-guide/flaw-constructor-does-real-work/
The blog states the problems quite well. Still I can't find a solution.
Concepts that urge you to keep your constructors lightweight:
Inversion of control (Dependency Injection)
Single responsibility principle (as applied to the constructor rather than a class)
Lazy initialization
Testing
K.I.S.S.
D.R.Y.
Links to arguments of why:
How much work should be done in a constructor?
What (not) to do in a constructor
Should a C++ constructor do real work?
http://misko.hevery.com/code-reviewers-guide/flaw-constructor-does-real-work/
If you check the arguments in the constructor, that validation code can't be shared if those arguments come in from any other source (setter, constructor, parameter object).
If you fill values into the collection or generate the information strings in the constructor, that code can't be shared with other constructors you may need to add later.
Besides not being shareable, such work also can't be delayed until it is really needed (lazy init). There is also overriding through inheritance, which offers more options with many methods that each do one thing rather than one do-everything constructor.
Your constructor only needs to put your class into a usable state. It does NOT have to be fully initialized. But it is perfectly free to use other methods to do the real work. That just doesn't take advantage of the "lazy init" idea. Sometimes you need it, sometimes you don't.
Just keep in mind that anything the constructor does or calls is being shoved down the users' and testers' throats.
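As a rough illustration of "usable state, not necessarily fully initialized", here is a sketch in Python. The class name `NumberRange`, its attribute names and the wording of the info string are assumptions on my part, since the question shows no code. The constructor only validates and stores the three arguments; the collection and the info string are built on first access:

```python
from functools import cached_property


class NumberRange:
    """Hypothetical class: numbers from start to end (inclusive) by step."""

    def __init__(self, start: int, end: int, step: int):
        # Cheap validation only: fail early instead of looping forever later.
        if step == 0:
            raise ValueError("step must not be zero")
        if (end - start) * step < 0:
            raise ValueError("step points away from end")
        self.start, self.end, self.step = start, end, step

    @cached_property
    def values(self) -> tuple:
        # Built on first access, then cached on the instance.
        stop = self.end + (1 if self.step > 0 else -1)
        return tuple(range(self.start, stop, self.step))

    @cached_property
    def info(self) -> str:
        # Assumed format; the real info strings are not shown in the question.
        return f"{len(self.values)} values from {self.start} to {self.end}"
```

A test for the argument checks now needs nothing but the constructor, and a test for `values` or `info` can target each piece on its own. Whether the laziness is worth it depends on how expensive the work really is.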
EDIT:
You still haven't accepted an answer and I've had some sleep so I'll take a stab at a design. A good design is flexible so I'm going to assume it's OK that I'm not sure what the information strings are, or whether our object is required to represent a set of numbers by being a collection (and so provides iterators, size(), add(), remove(), etc) or is merely backed by a collection and provides some narrow specialized access to those numbers (such as being immutable).
This little guy is the Parameter Object pattern
/** Throws exception if sign of endValue - startValue != sign of stepSize */
ListDefinition(T startValue, T endValue, T stepSize);
T can be int or long or short or char. Have fun but be consistent.
/** An interface, independent from any one collection implementation */
ListFactory(ListDefinition ld){
/** Make as many as you like */
List<T> build();
}
If we don't need to narrow access to the collection, we're done. If we do, wrap it in a facade before exposing it.
/** Provides read access only. Immutable if List l kept private. */
ImmutableFacade(List l);
Oh wait, requirements change, forgot about 'information strings'. :)
/** Build list of info strings */
InformationStrings(String infoFilePath) {
List<String> read();
}
Have no idea if this is what you had in mind but if you want the power to count line numbers by twos you now have it. :)
/** Assuming information strings have a 1 to 1 relationship with our numbers */
MapFactory(List l, List infoStrings){
/** Make as many as you like */
Map<T, String> build();
}
So, yes I'd use the builder pattern to wire all that together. Or you could try to use one object to do all that. Up to you. But I think you'll find few of these constructors doing much of anything.
EDIT2
I know this answer's already been accepted but I've realized there's room for improvement and I can't resist. The ListDefinition above works by exposing its contents with getters, ick. There is a "Tell, don't ask" design principle that is being violated here for no good reason.
ListDefinition(T startValue, T endValue, T stepSize) {
List<T> buildList(List<T> l);
}
This lets us build any kind of list implementation and have it initialized according to the definition. Now we don't need ListFactory. buildList is something I call a shunt. It returns the same reference it accepted after having done something with it. It simply allows you to skip giving the new ArrayList a name. Making a list now looks like this:
ListDefinition<int> ld = new ListDefinition<int>(3, 1, -1);
List<int> l = new ImmutableFacade<int>( ld.buildList( new ArrayList<int>() ) );
Which works fine. Bit hard to read. So why not add a static factory method:
List<int> l = ImmutableRangeOfNumbers.over(3, 1, -1);
This doesn't accept dependency injections but it's built on classes that do. It's effectively a dependency injection container. This makes it a nice shorthand for popular combinations and configurations of the underlying classes. You don't have to make one for every combination. The point of doing this with many classes is now you can put together whatever combination you need.
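For readers who want to see the pieces fit together, here is a loose Python rendering of the design above. It keeps the names ListDefinition, buildList (as `build_list`) and ImmutableRangeOfNumbers.over from this answer; the method bodies are only a guess at the intent, and a plain tuple stands in for ImmutableFacade:

```python
class ListDefinition:
    """Parameter object: validates and holds start, end and step."""

    def __init__(self, start: int, end: int, step: int):
        if step == 0 or (end - start) * step < 0:
            raise ValueError("step must be non-zero and point toward end")
        self._start, self._end, self._step = start, end, step

    def build_list(self, target: list) -> list:
        # The "shunt": fill the list we were handed and return the same reference.
        value = self._start
        while (self._end - value) * self._step >= 0:
            target.append(value)
            value += self._step
        return target


class ImmutableRangeOfNumbers:
    @staticmethod
    def over(start: int, end: int, step: int) -> tuple:
        # Shorthand for one popular combination of the underlying pieces.
        # A tuple plays the role of the read-only facade (an assumption).
        return tuple(ListDefinition(start, end, step).build_list([]))
```

With this, `ImmutableRangeOfNumbers.over(3, 1, -1)` yields the collection from the question, while `ListDefinition` can still be handed any list the caller prefers.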
Well, that's my 2 cents. I'm gonna find something else to obsess on. Feedback welcome.
As far as cohesion is concerned, there's no "real work", only work that's in line (or not) with the class/method's responsibility.
A constructor's responsibility is to create an instance of a class. And a valid instance for that matter. I'm a big fan of keeping the validation part as intrinsic as possible, so that you can see the invariants every time you look at the class. In other words, that the class "contains its own definition".
However, there are cases when an object is a complex assemblage of multiple other objects, with conditional logic, non-trivial validation or other creation sub-tasks involved. This is when I'd delegate the object creation to another class (Factory or Builder pattern) and restrain the accessibility scope of the constructor, but I think twice before doing it.
In your case, I see no conditionals (except argument checking), no composition or inspection of complex objects. The work done by your constructor is cohesive with the class because it essentially only populates its internals. While you may (and should) of course extract atomic, well identified construction steps into private methods inside the same class, I don't see the need for a separate builder class.
The constructor is a special member function, in that it constructs the object, but after all it is a member function. As such, it is allowed to do things.
Consider for example C++ std::fstream. It opens a file in the constructor. Can throw an exception, but doesn't have to.
As long as you can test the class, it is all good.
It's true, a constructor should do a minimum of work oriented to a single aim: successful creation of a valid object. Whatever it takes is OK. But not more.
In your example, creating this collection in the constructor is perfectly valid, as an object of your class represents a set of numbers (your words). If an object is a set of numbers, you should clearly create it in the constructor! Otherwise the constructor does not do what it is made for: constructing a fresh, valid object.
These info strings catch my attention. What is their purpose? What exactly do you do with them? This sounds like something peripheral, something that can be left for later and exposed through a method, like
String getInfo()
or similar.
If you want to use Microsoft's .NET Framework as an example here, it is perfectly valid, both semantically and in terms of common practice, for a constructor to do some real work.
An example of where Microsoft does this is in their implementation of System.IO.FileStream. This class performs string processing on path names, opens new file handles, opens threads, binds all sorts of things, and invokes many system functions. The constructor is actually, in effect, about 1,200 lines of code.
I believe your example, where you are creating a list, is absolutely fine and valid. I would just make sure that you fail as early as possible. Say, if the minimum is higher than the maximum, you could get stuck in an infinite loop with a poorly written loop condition, thus exhausting all available memory.
The takeaway is "it depends" and you should use your best judgement. If all you wanted was a second opinion, then I say you're fine.
It's not a good practice to do "real work" in the constructor: you can initialize class members, but you shouldn't call other methods or do more "heavy lifting" in the constructor.
If you need to do some initialization which requires a big amount of code running, a good practice will be to do it in an init() method which will be called after the object was constructed.
The reasoning for not doing heavy lifting inside the constructor is: in case something bad happens, and fails silently, you'll end up having a messed up object and it'll be a nightmare to debug and realize where the issues are coming from.
In the case you describe above I would only do the assignments in the constructor and then, in two separate methods, I would implement the validations and generate the string-information.
Implementing it this way also conforms with SRP: "Single Responsibility Principle" which suggests that any method/function should do one thing, and one thing only.
I often need to decide between these two strategies for the object design:
An object that is fully initialised and ready to use after its construction. The constructor often requires a complex list of parameters, hence the object initialisation is nontrivial. All objects having it as a member variable will also need nontrivial constructors. This may lead to code whose complexity is concentrated at object constructors, often making the code hard to follow.
An object with default constructor. The object variables are set individually by means of setter methods. This approach has the disadvantage that most methods need to check whether the object is fully initialized, hence complicating the code.
What is your personal preference between the two, and how do you decide when to use one or the other?
In my opinion if a constructor is getting too bloated it's time to split up your object in more different, smaller objects. This might be impossible in some rare cases, but in most cases it can be done.
Neither.
Huge parameter lists indicate the object does too much. Lots of properties that need to be set before the object can have a valid and useful output indicate the same.
So neither approach is a solution as far as I'm concerned.
There are lots of ways to break these things up, but outside of a specific scenario, the only rule is, "It needs doing".
Options include aggregation into other objects, "controller" classes, and various communicator patterns. Are some categories first-class objects? Can some be hidden in the implementation?
I don't accept that the two options you present are the only ones, except possibly from a pragmatic point of view in terms of getting the code out of the door. Which one I was then forced to choose, would simply depend on how many calls to the constructor with different parameters the code required, versus how much validation would be needed to confirm all the properties were set, and possibly the impact on unit tests, which because the object is a mess would be unwieldy or limited.
If a constructor takes many arguments — you call this non-trivial object initialisation — and you don't want to split up your class into smaller ones, then one alternative is to put the parameters into a Parameter Object and then only pass that object to the constructor.
Second, I believe that you should distinguish between...
object properties that absolutely must be set if the object is supposed to do its work, and there is no sensible default value. These properties should be initialised via a constructor parameter.
object properties that can be set optionally, or overridden, by the user. While you might initialise such properties in the constructor, you don't have to have a separate constructor parameter for them. Instead, you might assign a sensible default value to them that still can be overridden by the user through a setter method.
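A minimal sketch of both ideas, in Python. The names `ConnectionSettings`, `Client` and `timeout_seconds` are purely hypothetical and only serve the illustration: the required values travel in a parameter object passed to the constructor, while the optional one starts with a default and can be overridden afterwards:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectionSettings:
    """Hypothetical parameter object bundling the required values."""
    host: str
    port: int


class Client:
    def __init__(self, settings: ConnectionSettings):
        # Required: no sensible default, so it must come in via the constructor.
        self._settings = settings
        # Optional: starts with a sensible default, can be overridden later.
        self.timeout_seconds = 30.0

    def describe(self) -> str:
        s = self._settings
        return f"{s.host}:{s.port} (timeout {self.timeout_seconds}s)"


client = Client(ConnectionSettings("example.org", 8080))
```

The object is usable right after construction, so no method has to check whether it was fully initialized; assigning `client.timeout_seconds = 5.0` later only changes behaviour, not validity.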
There is also an alternative to the first type of properties (those that must absolutely have a user-provided value): properties which are provided through overriding an abstract getter in a derived class:
abstract class ComplicatedFoo {
protected abstract T getSomeDependency(); // replaces required ctor parameter
}
P.S.: The book "Dependency Injection" by Dhanji R. Prasanna (Manning Publications) gives a good overview of the various ways how to initialise an object.
It's always good to initialize all your variables in the constructor, but to a default value. If it is difficult to get the value of the variable (for example, you have to call some function somewhere to get that value), you may set that value to an invalid one and then later you set the correct value.
It is not a good idea to make the constructor so complex, because you can't return an error from the constructor (I don't know whether it is OK to throw an exception in the constructor, because I particularly don't like throwing exceptions anywhere). Also, you can't call virtual functions there, and so on.
An approach I like when the construction of the class is complex is to create an "init" function. Then I can do something like:
Person::Person()
{
    age = -1;
    ...
}
int Person::Init()
{
    age = functionThatReturnsTheAgeFromSomeDB();
    if (age == -1)
    {
        return DB_ERROR;
    }
    ...
}
And so on.
I wonder if I can say that a constructor is a special case of a method?
You can say anything. Whether anyone will disagree with you depends on the context. Some language communities and standards define things that way.
More elaborately, it depends on what you mean by a 'method.' In C++, for example, one way to analyze the creation process is to say that it consists of a call to an operator new (perhaps just placement) followed by a call to a constructor method. From an implementation standpoint, a constructor looks, walks, and quacks like a method. In some compilers, you can even invoke one explicitly.
From a more theoretical viewpoint, someone might claim that constructors are some distinctive species. However, there is no single, true, privileged conceptual model of methods, constructors, or purple unicorns.
Gosh this is all subjective.
You could say so, just as you can say that a human is a special case of animal, however in most contexts mentioning animals implies non-human animals and mentioning methods implies non-constructor methods.
Technically, a constructor usually is a method. Whether it really is or is not depends largely on the particular environment. For example, in .NET constructors are methods called actually after an object is created. However, it's also possible to create an object without having a constructor called right after.
Update: Regarding .NET, or the Common Language Infrastructure to be more precise, ECMA 335, section 8.9.6.6 Constructors states:
New values of an object type are created via constructors. Constructors shall be instance methods, defined via a special form of method contract, which defines the method contract as a constructor for a particular object type.
I think a constructor is too special to be called a method
It doesn't return anything
It modifies the object before the object is initialized
It cannot call itself (imagine that)
blah blah blah
There might be differences between languages, but I wouldn't go as far as calling a constructor a "special method".
In languages that have constructors, you can usually think of a constructor as a special case of a factory method. (Note: I don't mean the GoF Factory Method Software Design Pattern, I'm just talking about any class method that creates new instances.) This "special casing" generally takes the form of annoying restrictions (e.g. in Java, you can only call the parent constructor at the beginning of the constructor), which is why even in languages that do have constructors, you often end up using or writing factory methods anyway.
So, if constructors are basically factory methods with restrictions, there is really no need to have them both, and thus many languages simply get rid of constructors. Examples include Objective-C, Ruby, Smalltalk, Self, Newspeak, ECMAScript/JavaScript, Io, Ioke, Seph and many others.
In Ruby, the closest thing to a constructor is the method Class#allocate, which simply allocates an empty object and sets that object's class pointer. Nothing more. Since such an empty object is obviously unusable, it needs to be initialized. By convention, this initialization is performed by #initialize. As a convenience, because it is cumbersome to always have to remember to both allocate and initialize (as any Objective-C developer can probably attest), there is a helper method called Class#new, which looks something like this:
class Class
  def new(*args, &block)
    obj = allocate
    # initialize is private, so it has to be invoked via send
    obj.send(:initialize, *args, &block)
    return obj
  end
end
This allows you to replace this:
foo = Foo.allocate
foo.send(:initialize, bar)
With this:
foo = Foo.new(bar)
It is important to note that there is nothing special about any of these methods. Well, with one exception: Class#allocate obviously has to be able to set the class pointer and to allocate memory, which is something that is not possible in Ruby. So, this method has to somehow come from outside the system, which e.g. in MRI means that it is written in C, not Ruby. But that only concerns the implementation. There are no special dispatch rules, no special override rules. It's just a method like any other that can e.g. call super wherever, whenever and however often it wants and can return what it wants.
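Python, which is not among the languages listed above, splits object creation in much the same way, so it can serve as a second illustration. The `Foo` class and the `make` helper below are made up. `__new__` plays the role of allocate, `__init__` plays the role of initialize, and both are ordinary methods that can be called separately:

```python
class Foo:
    def __init__(self, bar):
        # Plain initializer: runs on an already allocated object.
        self.bar = bar


def make(cls, *args, **kwargs):
    """Roughly what calling cls(...) does for a simple class like Foo."""
    obj = cls.__new__(cls)          # allocate an empty instance
    obj.__init__(*args, **kwargs)   # initialize it
    return obj


empty = Foo.__new__(Foo)   # allocated but never initialized
foo = make(Foo, 42)
```

`empty` is a genuine `Foo` that simply has no `bar` attribute yet, which mirrors the "empty object" left behind by a bare allocate.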
"Special" is the magic word here. There's absolutely nothing wrong with calling a constructor a special method, but what "special" implies can vary depending on the language.
In most cases, "special" means they can't return values or be called as a method without creating a new object. But there are always exceptions: a prime example is JavaScript, where a constructor is no different from a normal function, it can return its own values and it can be called either as a constructor or as a simple function.
At least in vb.net, constructors can have a non-standard control flow. If the first statement of a constructor is a call to New (of either the same type or a base type), the sequence of events will be: (1) perform the call; (2) initialize all the fields associated with the type; (3) finish handling the rest of the constructor. No normal method has that sort of control flow. This control flow makes it possible to do things like pass constructor parameters to field initializers of a derived type, if the base type is written to allow such.
@Tom Brito, personally I would agree with you that a constructor is a special case of method.
Also, see below:
A constructor in a class is a special type of subroutine called at the creation of an object.
... A constructor resembles an instance method, but it differs from a method in that it never has an explicit return-type...
Source:
Wikipedia
Also, you may read my comments on others' comment (woot4moo, phunehehe).
In my design I am using objects that evaluate a data record. The constructor is called with the data record and type of evaluation as parameters and then the constructor calls all of the object's code necessary to evaluate the record. This includes using the type of evaluation to find additional parameter-like data in a text file.
There are in the neighborhood of 250 unique evaluation types that use the same or similar code and unique parameters coming from the text file.
Some of these evaluations use different code so I benefit a lot from this model because I can use inheritance and polymorphism.
Once the object is created there isn't any need to execute additional code on the object (at least for now) and it is used more like a struct; it's kept on a list and 3 properties are used later.
I think this design is the easiest to understand, code, and read.
A logical alternative I guess would be using functions that return score structs, but you can't inherit from methods so it would make it kind of sloppy imo.
I am using vb.net and these classes will be used in an asp.net web app as well as in a distributed app.
thanks for your input
Executing code in a constructor is OK; but having only properties with no methods might be a violation of the tell don't ask principle: perhaps instead those properties should be private, and the code which uses ("asks") those properties should become methods of the class (which you can invoke or "tell").
In general, putting code that does anything significant in the constructor is not such a good idea, because you'll eventually get hamstrung on the rigid constructor execution order when you subclass.
Constructors are best used for getting your object to a consistent state. "Real" work is best handled in instance methods. With the work implemented as a method, you gain:
separation of what you want to evaluate from when you want to evaluate it.
polymorphism (if using virtual methods)
the option to split up the work into logical pieces, implementing each piece as a concrete template method. These template methods can be overridden in subclasses, which provides for "do it mostly like my superclass, but do this bit differently".
In short, I'd use methods to implement the main computation. If you're concerned that an object will be created without its evaluation method being called, you can use a factory to create the objects, which calls the evaluate method after construction. You get the safety of constructors, with the execution order flexibility of methods.
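A minimal sketch of that factory-plus-template-method idea, in Python rather than VB.NET. The class names, the `record` dictionary and the scoring formula are all invented for the example:

```python
class Evaluation:
    """Hypothetical base class for one evaluation type."""

    def __init__(self, record: dict):
        # The constructor only stores state; no evaluation happens here.
        self.record = record
        self.score = None

    def evaluate(self) -> None:
        # Template method: fixed skeleton, overridable steps.
        self.score = self.weight() * self.raw_value()

    def raw_value(self) -> float:
        return float(self.record["value"])

    def weight(self) -> float:
        return 1.0

    @classmethod
    def create(cls, record: dict) -> "Evaluation":
        # Factory: construct, then evaluate, so callers never see
        # an instance whose evaluation has not run.
        instance = cls(record)
        instance.evaluate()
        return instance


class DoubledEvaluation(Evaluation):
    def weight(self) -> float:
        # "Mostly like my superclass, but do this bit differently."
        return 2.0
```

Callers use `DoubledEvaluation.create(record)` and get a fully evaluated object back, while subclasses stay free to override individual steps without fighting the constructor execution order.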