Is it OK to create an object inside a function - oop

I am working on a VBA class that encapsulates downloading with MSXML2.XmlHttp.
There are three possibilities for the return value: Text, XML and Stream.
Should I create a function for each:
aText=myDownloader.TextSynchronous(URL,formData,dlPost,....)
aXml.load myDownloader.XmlSynchronous(URL,formData,dlPost,....)
Or can I just return the XmlHttpObject I created inside the class and then have
aText=myDownloader.Synchronous(URL,formData,dlPost,.....).ResponseText
aXML=myDownloader.Synchronous(URL,formData,dlPost,.....).ResponseXML
In the former case I can set the object to Nothing inside the class, but I have to write several functions that are more or less the same.
In the latter case I rely on the "garbage collector" but have a leaner class.
Both should work, but which one is better coding style?

In my opinion, the first way is better because you don't expose low-level details at a high level of abstraction.
I did something similar with a web crawler in Java: one class only handles the URL connection and fetches all the needed data (low level), and a high-level class uses that low-level class and returns an object called Page.
You can add a third method that only executes myDownloader.Synchronous(URL,formData,dlPost,.....) and stores the returned object in a private variable; the other methods then only read from this object. This way, you open the connection only once.
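A minimal sketch of that idea, written in Python rather than VBA for brevity. The class name Downloader, the injected transport callable and the method names are assumptions made for illustration; they are not part of MSXML2 or of the asker's class.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Optional


class Downloader:
    """Hides the low-level request and exposes only high-level results."""

    def __init__(self, transport: Callable[[str], bytes]) -> None:
        # `transport` is a hypothetical stand-in for the real HTTP call.
        self._transport = transport
        self._body: Optional[bytes] = None  # private: callers never see it

    def fetch(self, url: str) -> None:
        # The only place where the connection is used: once per fetch.
        self._body = self._transport(url)

    def text(self) -> str:
        return self._require_body().decode("utf-8")

    def xml(self) -> ET.Element:
        return ET.fromstring(self._require_body())

    def _require_body(self) -> bytes:
        if self._body is None:
            raise RuntimeError("call fetch() before reading the response")
        return self._body
```

The caller fetches once and then asks for whichever representation it needs, without ever touching the underlying request object.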

After much searching around on the web (triggered by the comment by EmmadKareem) I found this:
First of all, don't do localObject=Nothing at the end of a method: the variable goes out of scope anyway and is discarded. See this older but enlightening post on MSDN.
VBA uses reference counting and, apart from some older bugs in ADO, this seems to work quite well and (as I understand it) immediately discards resources that are no longer used. So from a performance/memory usage point of view this does not seem to be a problem.
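VBA is not shown here, but CPython also uses reference counting, so the behaviour can be sketched in Python as an analogy. FakeResponse and get_sync_obj are hypothetical names, and the immediate release shown is a CPython implementation detail rather than a language guarantee.

```python
released = []


class FakeResponse:
    """Stand-in for the request object returned by the helper class."""

    def __init__(self) -> None:
        self.response_text = "payload"

    def __del__(self) -> None:
        # Runs when the last reference to the object disappears.
        released.append("response released")


def get_sync_obj() -> FakeResponse:
    return FakeResponse()


# The temporary object is only referenced while this expression is evaluated.
text = get_sync_obj().response_text
```

No explicit cleanup call is needed here: once the expression has been evaluated, nothing refers to the temporary object any more.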
As to the coding style: I think the uncomfortable feeling I had when I designed this could go away by simply renaming the function to myDownloader.getSyncDLObj(...) or some such.
There seem to be two camps on code style. One promotes clean code, which is easy to read, but takes five lines every time you use it. Its most important rule is "every function should do one thing and one thing only". Their approach would probably look something like
myDownloader.URL="..."
myDownloader.method=dlSync
myDownloader.download
aText=myDownloader.getXmlHttpObj.ResponseText
myDownloader.freeResources
and the other is OK with the more cluttered, but less line-consuming
aText=myDownloader.getSyncObj(...).ResponseText
Both have their merits, and neither is wrong, dangerous or frowned upon. As this is a helper class and I use it to keep the inner workings of the xmlhttp out of the main code, I am more comfortable with the second approach here. (One line for one goal ;)
I would be very interested in anyone's take on the matter.

Related

Colour Highlighting of SIMILAR or ALTERNATIVE blocks of Code

Is there some way to highlight similar or alternative code blocks in Xcode or an alternative Obj-C IDE? In the attached picture you can quickly see which block runs after another and which is an alternative (if-else). (The code in the picture is just an example.) It seems to be a task related to {}-counting, so I expect there is some implementation.
Actually, I could understand the FLOW of the code in the picture only after I highlighted it as you see.
What you are asking for is scope highlighting. Xcode does not do this to my knowledge. However, you can mouse over the code folding column to the left and see the scope briefly.
Since this is kind of a non-answer, I originally wrote it as a comment. But it got too long, so I'm putting it here. Apologies in advance for avoiding the question. But I'm trying to address the problem behind the question…
If you have to color-code a method to understand it, the method is too long. Extract methods and give them meaningful names that clearly state their purpose (though not their implementation).
Here are rules-of-thumb I follow:
If a method has many local variables, or a number of different indentations, first use Extract Class to pull the method into a new class that works as a function object. Then promote the local variables to ivars.
Despite many who state the dangers, if an inner scope has one line, I omit the braces. This reduces the vertical distance of the code, which makes it more readable. Readability is more important. (But this may be risky if your code isn't covered by unit tests. So make sure the method is well-covered.)
When I see braces inside a method, I try to extract that portion into another (well-named) method.
Look for opportunities to extract the contents of if statements into predicate methods that express the what (burying the how inside the method).
I try to keep methods under six lines. Any more than that, and I start eyeing it critically: Is it doing more than one thing? Is it operating at more than one level of abstraction?
For much, much more on these principles, I highly recommend Clean Code episode 3.
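As an illustration of the predicate-method rule above, here is a small Python sketch. The Order type, the thresholds and the function names are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Order:
    total_cents: int
    customer_years: int
    has_coupon: bool


def qualifies_for_discount(order: Order) -> bool:
    # The name states the "what"; the "how" is buried in here.
    is_loyal_big_spender = order.total_cents >= 10_000 and order.customer_years >= 2
    return order.has_coupon or is_loyal_big_spender


def discounted(total_cents: int) -> int:
    # 10% off, computed in integer cents.
    return total_cents - total_cents // 10


def price(order: Order) -> int:
    # Reads at a single level of abstraction, with no nested conditions.
    if qualifies_for_discount(order):
        return discounted(order.total_cents)
    return order.total_cents
```

Each function stays well under six lines, and price can be understood without reading the discount rules.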

What are the drawbacks of encapsulating arguments for different cases in one object?

I'll give you an example about path finding. When you want to find a path, you can pick a final destination and an initial position and find the fastest way between the two; or you can define just the initial position and let the algorithm show every path you can take; or you may want to mock this for a test, give only the final destination and assume you "teleport" there; and so on. It's clear that the function is the same: finding a path. But the arguments may vary between implementations. I've searched a lot and found a lot of solutions: getting rid of the interface, putting all the arguments as fields in the implementation, using the visitor pattern...
But I'd like to know what the drawback is of putting every possible argument (not state) in one object (let's call it MovePreferences) and letting every implementation take what it needs. Sure, if you need another implementation that takes an argument you didn't expect, you will need to change MovePreferences, but that doesn't sound too bad, since you will only add methods to it, not refactor any existing method. Even though MovePreferences is not an object of my domain, I'm still tempted to do it. What do you think?
(If you have a better solution to this problem, feel free to add it to your answer.)
The question you are asking is really why have interfaces at all, no, why have any concept of context short of 'whatever I need?' I think the answers to that are pretty straightforward: programming with shared global state is easy for you, the programmer, and quickly turns into a vortex for everyone else once they have to coalesce different features, for different customers, render enhancements, etc.
Now the far other end of the spectrum is the DbC argument: every single interface must be a highly constrained contract that not only keeps the knowledge exchanged to an absolute minimum, but makes the possibility of mayhem minimal.
Frankly, this is one of the reasons why dependency injection can quickly turn into a mess: as soon as design issues like this come up, people just start injecting more 'objects,' often to get access to just one property, whose scope might not be the same as the scope of the present operation. [Different kind of nightmare.]
Unfortunately, there's almost no information in your question. Do I think it would be possible to correctly model the notion of a Route? Sure. That doesn't sound very challenging. Here are a few ideas:
Make a class called Route that has starting and ending points. Then a collection of Traversals. The idea here would be that a Route could completely ignore the notion of how someone got from point a to point b, where traversal could contain information about roads, traffic, closures, whatever. Then your mocked case could just have no Traversals inside.
Another option would be to make Route a Composite so that each trip is then seen as the stringing together of various segments. That's the way routes are usually presented: go 2 miles on 2 South, exit, go 3 miles east on Santa Monica Boulevard, etc. In this scenario, you could just have Routes that have no children.
Finally, you will probably need a creational pattern. Perhaps a Builder. That simplifies mocking things too because you can just make a mock builder and have it construct Routes that consist of whatever you need.
The other advantage of combining the Composite and Builder is that you could make a builder that can build a new Route from an existing one by trying to improve only the troubling subsegments, e.g. it got traffic information that the 2S was slow, it could just replace that one segment and present its new route.
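A rough Python sketch of the first idea (a Route with end points and a collection of traversals) combined with a Builder. It is not a full Composite, and Segment, RouteBuilder and their methods are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Segment:
    description: str
    miles: int


@dataclass(frozen=True)
class Route:
    start: str
    end: str
    segments: Tuple[Segment, ...] = ()  # empty for the mocked "teleport" case

    def total_miles(self) -> int:
        return sum(segment.miles for segment in self.segments)


class RouteBuilder:
    def __init__(self, start: str, end: str) -> None:
        self._start = start
        self._end = end
        self._segments: List[Segment] = []

    @classmethod
    def from_route(cls, route: Route) -> "RouteBuilder":
        # Start from an existing route so single segments can be improved.
        builder = cls(route.start, route.end)
        builder._segments = list(route.segments)
        return builder

    def add(self, description: str, miles: int) -> "RouteBuilder":
        self._segments.append(Segment(description, miles))
        return self

    def replace(self, index: int, description: str, miles: int) -> "RouteBuilder":
        self._segments[index] = Segment(description, miles)
        return self

    def build(self) -> Route:
        return Route(self._start, self._end, tuple(self._segments))
```

A test can build a Route with no segments at all, while a rerouting step can call from_route on an existing Route and replace only the slow segment.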
Consider an example:
Say 5 arguments are encapsulated in an object and passed to 3 methods.
If the structure of the object changes, we need to rerun the test cases for all 3 methods. If instead each method accepts only the arguments it needs, the methods unaffected by the change need not be retested.
So the first problem I see is increased testing effort.
Secondly, you will naturally violate the Single Responsibility Principle (SRP) if you pass more arguments than the method actually needs.

How much responsibility should a method have?

This is most certainly a language agnostic question and one that has bothered me for quite some time now. An example will probably help me explain the dilemma I am facing:
Let us say we have a method which is responsible for reading a file, populating a collection with some objects (which store information from the file), and then returning the collection...something like the following:
public List<SomeObject> loadConfiguration(String filename);
Let us also say that at the time of implementing this method, it would seem infeasible for the application to continue if the collection returned was empty (a size of 0). Now, the question is, should this validation (checking for an empty collection and perhaps subsequently throwing an exception) be done within the method? Or should this method's sole responsibility be to load the file, leaving validation to be done at some later stage outside of the method?
I guess the general question is: is it better to decouple the validation from the actual task being performed by a method? Will this, in general, make things easier to change or build upon at a later stage? In my example above, a different strategy might later be added to recover from an empty collection being returned from the 'loadConfiguration' method. That would be difficult if the validation (and resulting exception) was done in the method.
Perhaps I am being overly pedantic in the quest for some dogmatic answer, where instead it simply just relies on the context in which a method is being used. Anyhow, I would be very interested in seeing what others have to say regarding this.
Thanks all!
My recommendation is to stick to the single responsibility principle which says, in a nutshell, that each object should have 1 purpose. In this instance, your method has 3 purposes and then 4 if you count the validation aspect.
Here's my recommendation on how to handle this and how to provide a large amount of flexibility for future updates.
Keep your LoadConfig method
Have it call a new method for reading the file.
Pass the previous method's return value to another method for loading the file into the collection.
Pass the object collection into some validation method.
Return the collection.
That's taking 1 method initially and breaking it into 4 with one calling 3 others. This should allow you to change pieces w/o having any impact on others.
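The steps above can be sketched in Python as follows. The one-entry-per-line file format and all function names are assumptions made for the example.

```python
from pathlib import Path
from typing import List


def read_file(filename: str) -> str:
    return Path(filename).read_text(encoding="utf-8")


def parse_entries(contents: str) -> List[str]:
    # Hypothetical format: one entry per non-blank line.
    return [line.strip() for line in contents.splitlines() if line.strip()]


def validate(entries: List[str]) -> None:
    if not entries:
        raise ValueError("configuration is empty")


def load_configuration(filename: str) -> List[str]:
    # The public method only coordinates the three single-purpose helpers.
    contents = read_file(filename)
    entries = parse_entries(contents)
    validate(entries)
    return entries
```

Replacing the validation strategy later means touching validate only; reading and parsing stay as they are.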
Hope this helps
"I guess the general question is: is it better to decouple the validation from the actual task being performed by a method?"
Yes. (At least if you really insist on answering such a general question – it’s always quite easy to find a counter-example.) If you keep both the parts of the solution separate, you can exchange, drop or reuse any of them. That’s a clear plus. Of course you must be careful not to jeopardize your object’s invariants by exposing the non-validating API, but I think you are aware of that. You’ll have to do some little extra typing, but that won’t hurt you.
I will answer your question by a question: do you want various validation methods for the product of your method ?
This is the same as the 'constructor' issue: is it better to raise an exception during the construction or initialize a void object and then call an 'init' method... you are sure to raise a debate here!
In general, I would recommend performing the validation as soon as possible. This is known as Fail Fast: finding problems as early as possible is better than delaying detection, since the diagnosis is immediate, whereas later you would have to trace back through the whole flow....
If you're not convinced, think of it this way: do you really want to write 3 lines every time you load a file ? (load, parse, validate) Well, that violates the DRY principle.
So, go agile there:
write your method with validation: it is responsible for loading a valid configuration (1)
if you ever need some parametrization, add it then (like a 'check' parameter, with a default value which preserves the old behavior of course)
(1) Of course, I don't advocate a single method to do all this at once... it's an organization matter: under the covers this method should call dedicated methods to organize the code :)
To deflect the question to a more basic one, each method should do as little as possible. So in your example, there should be a method that reads in the file, a method that extracts the necessary data from the file, another method to write that data to the collection, and another method that calls these methods. The validation can go in a separate method, or in one of the others, depending on where it makes the most sense.
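using System.Collections.ObjectModel;
using System.IO;

public class FileDataCollection : Collection<FileData> { }

private string ReadFile(string fileSpec)
{
    // read in the file and return its contents as text,
    // so that the result can be passed straight to GetFileData
    return File.ReadAllText(fileSpec);
}
private FileData GetFileData(string fileContents)
{
    // code to create FileData struct from file contents
}
public void DoItAll(string fileSpec, FileDataCollection filDtaCol)
{
    // read, extract, then store: each step lives in its own method
    filDtaCol.Add(GetFileData(ReadFile(fileSpec)));
}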
Add validation, verification to each of the methods as appropriate
You are designing an API and should not make any unnecessary assumptions about your client. A method should take only the information that it needs, return only the information requested, and only fail when it is unable to return a meaningful value.
So, with that in mind, if the configuration is loadable but empty, then returning an empty list seems correct to me. If your client has an application specific requirement to fail when provided an empty list, then it may do so, but future clients may not have that requirement. The loadConfiguration method itself should fail when it really fails, such as when it is unable to read or parse the file.
But you can continue to decouple your interface. For example, why must the configuration be stored in a file? Why can't I provide a URL, a row in a database, or a raw string containing the configuration data? Very few methods should take a file path as an argument since it binds them tightly to the local file system and makes them responsible for opening, reading, and closing files in addition to their core logic. Consider accepting an input stream as an alternative. Or if you want to allow for elaborate alternatives -- like data from a database -- consider accepting a ConfigurationReader interface or similar.
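A minimal Python sketch of this approach: the loader takes a text stream instead of a file path, returns an empty list for empty input, and fails only when it cannot parse. The key=value format and the ConfigurationError name are assumptions made for the example.

```python
import io
from typing import List, TextIO, Tuple


class ConfigurationError(Exception):
    """Raised only when the input cannot be parsed at all."""


def load_configuration(source: TextIO) -> List[Tuple[str, str]]:
    entries = []
    for number, line in enumerate(source, start=1):
        line = line.strip()
        if not line:
            continue
        key, separator, value = line.partition("=")
        if not separator:
            raise ConfigurationError(f"line {number}: expected key=value")
        entries.append((key.strip(), value.strip()))
    # An empty configuration is a valid result, not an error.
    return entries


# Usage: any text stream works, not just a file on disk.
entries = load_configuration(io.StringIO("host = example.org\nport = 8080\n"))
```

A client that cannot continue with an empty list can check for that itself, while other clients remain free to accept it.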
Methods should be highly cohesive ... that is single minded. So my opinion would be to separate the responsibilities as you have described. I sometimes feel tempted to say...it is just a short method so it does not matter...then I regret it 1.5 weeks later.
I think this depends on the case: if you can think of a scenario where you would use this method, it returned an empty list, and that would be okay, then I would not put the validation inside the method. But for a method which, for example, inserts data into a database and that data has to be validated (is the email address correct, has a name been specified, ...), it should be OK to put validation code inside the function and throw an exception.
Another alternative, not mentioned above, is to support Dependency Injection and have the method client inject a validator. This would allow the preservation of the "strong" Resource Acquisition Is Initialization principle, that is to say Any Object which Loads Successfully is Ready For Business (Matthieu's mention of Fail Fast is much the same notion).
It also allows a resource implementation class to create its own low-level validators which rely on the structure of the resource without exposing clients to implementation details unnecessarily, which can be useful when dealing with multiple disparate resource providers such as Ryan listed.
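One way to picture the injected validator, as a Python sketch. The Validator alias and the two example validators are hypothetical, and the default shown is only one possible choice.

```python
from typing import Callable, Iterable, List

Validator = Callable[[List[str]], None]


def require_non_empty(entries: List[str]) -> None:
    if not entries:
        raise ValueError("configuration is empty")


def accept_anything(entries: List[str]) -> None:
    """Validator for clients that can recover from an empty configuration."""


def load_configuration(
    lines: Iterable[str], validator: Validator = require_non_empty
) -> List[str]:
    entries = [line.strip() for line in lines if line.strip()]
    # Whatever is returned has already passed the injected check.
    validator(entries)
    return entries
```

The loader still fails fast, but the client decides what counts as a failure.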

Passing object references needlessly through a middleman

I often find myself needing reference to an object that is several objects away, or so it seems. The options I see are passing a reference through a middle-man or just making something available statically. I understand the danger of global scope, but passing a reference through an object that does nothing with it feels ridiculous. I'm okay with a little bit passing around, I suppose. I suspect there's a line to be drawn somewhere.
Does anyone have insight on where to draw this line?
Or a good way to deal with the problem of distributing references amongst dependent objects?
Use the Law of Demeter (with moderation and good taste, not dogmatically). If you're coding a.b.c.d.e, something IS wrong -- you've nailed forevermore the implementation of a to have a b which has a c which... EEP!-) One or at the most two dots is the maximum you should be using. But the alternative is NOT to plump things into globals (and ensure thread-unsafe, buggy, hard-to-maintain code!), it is to have each object "surface" those characteristics it is designed to maintain as part of its interface to clients going forward, instead of just letting poor clients go through such unending chains of nested refs!
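A small Python sketch of "surfacing" a characteristic instead of chaining through nested references. The Order, Customer and Address types are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Address:
    city: str


@dataclass
class Customer:
    address: Address

    def city(self) -> str:
        return self.address.city


@dataclass
class Order:
    customer: Customer

    def shipping_city(self) -> str:
        # Order surfaces what clients need; its internals can change freely.
        return self.customer.city()


def shipping_label(order: Order) -> str:
    # One dot instead of order.customer.address.city
    return f"Ship to {order.shipping_city()}"
```

If Customer later stores its address differently, shipping_label does not change.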
This smells of an abstraction that may need some improvement. You seem to be violating the Law of Demeter.
In some cases a global isn't too bad.
Consider, you're probably programming against an operating system's API. That's full of globals, you can probably access a file or the registry, write to the console. Look up a window handle. You can do loads of stuff to access state that is global across the whole computer, or even across the internet... and you don't have to pass a single reference to your class to access it. All this stuff is global if you access the OS's API.
So, when you consider the number of global things that often exist, a global in your own program probably isn't as bad as many people try and make out and scream about.
However, if you want to have very nice OO code that is all unit testable, I suppose you should write wrapper classes around any access to globals, whether they come from the OS or are declared by yourself, to encapsulate them. This means your class that uses this global state can get references to the wrappers, and they can be replaced with fakes.
Hmm, anyway. I'm not quite sure what advice I'm trying to give here, other than say, structuring code is all a balance! And, how to do it for your particular problem depends on your preferences, preferences of people who will use the code, how you're feeling on the day on the academic to pragmatic scale, how big the code base is, how safety critical the system is and how far off the deadline for completion is.
I believe your question is revealing something about your classes. Maybe the responsibilities could be improved ? Maybe moving some code would solve problems ?
Tell, don't ask.
That's how it was explained to me. There is a natural tendency to call classes to obtain some data. Taken too far, asking too much, typically leads to heavy "getter sequences". But there is another way. I must admit it is not easy to find, but improves gradually in a specific code and in the coder's habits.
Class A wants to perform a calculation, and asks B's data. Sometimes, it is appropriate that A tells B to do the job, possibly passing some parameters. This could replace B's "getName()", used by A to check the validity of the name, by an "isValid()" method on B.
"Asking" has been replaced by "telling" (calling a method that executes the computation).
For me, this is the question I ask myself when I find too many getter calls. Gradually, the methods find their place in the correct object, and everything gets a bit simpler: I have fewer getters and fewer calls to them. I have less code, and it carries more meaning and is better aligned with the functional requirement.
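The getName()/isValid() example can be sketched in Python like this. The Account class and the validity rule are assumptions made for illustration.

```python
class Account:
    def __init__(self, name: str) -> None:
        self._name = name

    def get_name(self) -> str:
        # "Ask" style: exposes data so callers can do the checking themselves.
        return self._name

    def is_valid(self) -> bool:
        # "Tell" style: the object that owns the data performs the check.
        return bool(self._name.strip()) and len(self._name) <= 50


def can_register_asking(account: Account) -> bool:
    # The rule lives outside Account and is repeated by every caller.
    name = account.get_name()
    return bool(name.strip()) and len(name) <= 50


def can_register_telling(account: Account) -> bool:
    # The caller no longer knows what a valid name looks like.
    return account.is_valid()
```

Both functions give the same answer, but only the second survives a change to the validity rule without edits in the caller.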
Move the data around
There are other cases where I move some data. For example, if a field moves two objects up, the length of the "getter chain" is reduced by two.
I believe nobody can find the correct model at first.
I first think about it (using hand-written diagrams is quick and a big help), then code it, then think again facing the real thing... Then I code the rest, and any smells I feel in the code, I think again...
Split and merge objects
If a method on A needs data from C, with B as a middle man, I check whether A and C have something in common. Possibly, A or a part of A could become C (possible splitting of A, merging of A and C) ...
However, there are cases where I keep the getters of course.
But it's less likely a long chain will be created.
A long chain will probably get broken by one of the techniques above.
I have three patterns for this:
Pass the necessary reference to the object's constructor -- the reference can then be stored as a data member of the object, and doesn't need to be passed again; this implies that the object's factory has the necessary reference. For example, when I'm creating a DOM, I pass the element name to the DOM node when I construct the DOM node.
Let things remember their parent, and get references to properties via their parent; this implies that the parent or ancestor has the necessary property. For example, when I'm creating a DOM, there are various things which are stored as properties of the top-level DomDocument ancestor, and its child nodes can access those properties via the reference which each one has to its parent.
Put all the different things which are passed around as references into a single class, and then pass around just that one class instance as the only thing that's passed around. For example, there are many properties required to render a DOM (e.g. the GDI graphics handle, the viewport coordinates, callback events, etc.) ... I put all of these things into a single 'Context' instance which is passed as the only parameter to the methods of the DOM nodes to be rendered, and each method can get whichever properties it needs out of that context parameter.
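The third pattern might look like this in Python. RenderContext, its fields and the Node class are hypothetical; a list of strings stands in for a real graphics handle.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RenderContext:
    viewport_width: int
    output: List[str] = field(default_factory=list)  # stand-in for a graphics handle


class Node:
    def __init__(self, name: str, children: Optional[List["Node"]] = None) -> None:
        self.name = name
        self.children = children or []

    def render(self, ctx: RenderContext, depth: int = 0) -> None:
        # Each node takes only what it needs out of the single context object.
        line = "  " * depth + f"<{self.name}>"
        ctx.output.append(line[: ctx.viewport_width])
        for child in self.children:
            child.render(ctx, depth + 1)
```

Adding a new rendering property later means adding a field to RenderContext, not changing the signature of every render method.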

Should a long method used only once be in its own class or in a function?

A lot of times in code on the internet or code from my co-workers I see them creating an Object with just one method which only gets used once in the whole application. Like this:
class iOnlyHaveOneMethod{
    public function theOneMethod(){
        //loads and loads of code, say 100's of lines
        // but it only gets used once in the whole application
    }
}
if($foo){
    $bar = new iOnlyHaveOneMethod;
    $bar->theOneMethod();
}
Is that really better than:
if($foo){
//loads and loads of code which only gets used here and nowhere else
}
?
For readability it makes sense to move the loads and loads of code away, but shouldn't it just be in a function?
function loadsAndLoadsOfCode(){
//Loads and loads of code
}
if($foo){ loadsAndLoadsOfCode(); }
Is moving the code to a new object really better than just creating a function or putting the code in there directly?
To me the function makes more sense and seems more readable than creating an object which is hardly of any use since it just holds one method.
The problem is not whether it's in a function or an object.
The problem is that you have hundreds of lines in one blob. Whether that mass of code is in a method of an object or just a function seems more or less irrelevant to me, just being minor syntactic sugar.
What are those hundreds of lines doing? That's the place to look to implement object oriented best practice.
If your other developers really think using an object instead of a function makes it significantly more "object oriented" but having a several-hundred line function/method isn't seen as a code smell, then I think organisationally you have some education to do.
Well, if there really is "loads and loads" of code in the method, then it should be broken down into several protected methods in that class, in which case the use of a class scope is justified.
Perhaps that code isn't reusable because it hasn't been factored well into several distinct methods. By moving it into a class and breaking it down, you might find it could be better reused elsewhere. At least it would be much more maintainable.
Whilst the function with hundreds of lines of code clearly indicates a problem (as others have already pointed out), placing it in a separate instance class rather than a static function does have advantages, which you can exploit by rejigging your example a fraction:
// let's instead assume that $bar was set earlier using a setter
if($foo){
$bar = getMyBar();
$bar->theOneMethod();
}
This gives you a couple of advantages now:
This is a simple example of the Strategy Pattern. If $bar implements an interface that provides theOneMethod(), then you can dynamically switch implementations of that method;
Testing your class independently of $bar->theOneMethod() is dramatically easier, as you can replace $bar with a mock at testing time.
Neither of these advantages is available if you just use a static function.
I would argue that, whilst simple static functions have their place, non-trivial methods (as this clearly is by the 'hundreds of lines' comment) deserve their own instance anyway:
to separate concerns;
to aid testing;
to aid refactoring and reimplementation.
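The two advantages can be sketched in Python as follows. Task, RealTask, FakeTask and Caller are hypothetical names; the point is only that the caller receives its collaborator from outside.

```python
from typing import Optional, Protocol


class Task(Protocol):
    def the_one_method(self) -> str: ...


class RealTask:
    def the_one_method(self) -> str:
        # Imagine the long-running work living here.
        return "did the real work"


class FakeTask:
    """Test double that records calls instead of doing the work."""

    def __init__(self) -> None:
        self.calls = 0

    def the_one_method(self) -> str:
        self.calls += 1
        return "fake result"


class Caller:
    def __init__(self, bar: Task) -> None:
        self._bar = bar  # injected, so the implementation can be switched

    def run(self, foo: bool) -> Optional[str]:
        if foo:
            return self._bar.the_one_method()
        return None
```

Production code passes a RealTask, while a unit test passes a FakeTask and checks how often it was called.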
You are really asking two questions here:
Is just declaring a function better than creating an object to hold only this function?
Should any function contain "loads of code"?
The first part: If you want to be able to dynamically switch functions, you may need the explicit object encapsulation as a workaround in languages that cannot handle functions this way. Of course, having to allocate a new object, assign it to a variable, then call the function from that variable is a bit dumb when all you want to do is call a function.
The second part: Ideally not, but there is no clear definition of "loads", and it may be the appropriate thing to do in certain cases.
Yes, the presence of loads and loads of code is a code smell.
I'd say you almost never want to have either a block or a method with loads of code in it; it doesn't matter whether it's in its own class or not.
Moving it to an object might be a first step in refactoring, though, so it might make sense in that way: first move it to its own class and later split it into several smaller methods.
Well, I'd say it depends on how tightly coupled the block of code is with the calling section of code.
If it's so tightly coupled, that I can't imagine it being used anywhere else, I'd prefer sticking it in a private method of the calling class. That way it won't be visible to other parts of your system, guaranteeing it won't be misused by others.
On the other hand, if the block of code is generic enough (e.g. email validation) to possibly be of interest in other parts of the system, I'd have no problem extracting that part into its own class, and then consider that to be a utility class. Even if it means it will be a single-method class.
If your question was more in the lines of "what to do with hundreds and hundreds of lines of code", then you really need to be doing some refactoring.
A single method with lots of code is a code smell. That said, my first thought was to at least make the method static: there is no data in the class, so there is no need to create an object.
I think I would rephrase the question you are asking. I think you want to ask: is my class following the single responsibility principle? Is there any way to decompose your class into separate smaller pieces that might change independently of each other (data access, parsing, etc.)? Can you unit test your class easily?
If you can say yes to the above items, I wouldn't worry about method versus new class, as the whole point here is that you have readable, maintainable code.
In my team we raise a red flag if a class gets long (over x lines), but that is just a heuristic: if your class has 2000 lines of code, it can probably be broken down and is probably not following SRP.
For testability, it is definitely better to break it out into a separate class with separate method(s). It is a whole lot easier to write unit tests for single methods than as part of an inline if statement in a code-behind file or whatnot.
That being said, I agree with everyone else that the method should be broken out into single responsibility methods instead of hundreds of lines of code. This too will make it more readable and easier to test. And hopefully, you might get some reuse out of some of the logic contained in that big mess of code.