Comments are always wrong -- right, or no? [closed] - documentation

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 5 years ago.
Improve this question
Randomly ran across this blog on how one should never read comments, and thought to myself all the comments I've read that were either wrong, dated, or simply confusing. Should one simple never read comments and/or just use regex replace pattern to delete... :-) ...just kidding, but really, maybe not. At the very least seems like comments and the related code should be have timestamps. Agree, or no?
FYI, blog appears to be by this stackoverflow user: Nosredna

An inaccurate comment is just as serious of a bug as wrong code. If it is a persistent problem, I would say the project has serious issues. Big projects such as PostgreSQL (94M uncompressed in src directory) rely on accurate comments to help programmers understand things quickly. They also take comments very seriously.
edit:
Also, if you can not summarize what your function is doing, then who can? After you are done writing a function, writing the comment for it can be a test that you fully understand what is going on. If things are still a little muddy in your mind, it will become very apparent when you try write the comment. And this a good thing, that shows what still needs to be worked on.

In the real world, programming does not just mean writing code. It means writing code that another programmer (or yourself, in 3 months time) can understand and maintain efficiently.
Any programmer who tells you that comments/documentation are not worth the time it takes to write/maintain them has a number of problems:
His long term work rate (and that of any colleagues who have to work with his code) will be lower than it could be. People will waste a lot of time understanding code that could be described in a quick sentence.
Bug rates relating to his code will be higher than they could be, because all those assumptions and special cases that he forgot to mention will constantly trip people up. (How many times have you had to do something "unusual" in your code to make something work, forgotten to comment it to say why you did it that way, and then later on thought it looked like a bug and "fixed" it, only to find that you've broken things horribly because of a subtlety you didn't remember?)
If he is too lazy to write down how his code works or should be used, what other shortcuts is he taking? Does he bother to check for nulls and handle exceptions? Does he care about his interfaces being consistent? Will he bother to refactor the name of a method/variable if its underlying meaning changes? Usually bad commenting is just the tip of the iceberg.
Many such programmers are too focussed on "the cool stuff" (like optimising every last cycle out of their code, or finding a clever way to obfuscate a seemingly simple piece of code into a mindboggling template) to understand commercial realities (e.g. he may have trouble getting his code working and shipped to deadlines because he focusses on squeezing unnecessary performance out of it rather than just getting the job done)
How good are his designs? I do most of my design and thinking when I write down docs/comments. By trying to explain things to the reader of my comments, I flush out a lot of flaws and assumptions that I otherwise would miss. I would go so far as to say that I usually write comments (my intent/design) and then sync the code with them. In most cases if there is a bug in my code, it is the code that is wrong not the comment.
He has not realised that by using good variable naming, well designed naming prefixes/conventions, and documentation comments, he can make much better use of fabulous time-saving features of his IDE (auto-complete, intellisense help, etc), and code much faster, with lower defect rates.
Fundamentally, he probably doesn't understand how to write good comments:
1) Comments should act as a quick-read summary of a chunk of code. I can read one brief line of text instead of having to work out the meaning of many lines of code. If you find yourself reading the code to navigate it, it is poorly commented. I read the comments to navigate and understand the code, and I only read the code itself when I actually need to work on that specific bit of it.
2) Method/Class comments should describe to a caller everything they need to know to use that class, without having to look inside the "black box" of the code implementation. They will be able to quickly grasp how to use the class/method, and understand all the side effects and the "contract" that an API provides - what can be null and what must be supplied, and which exceptions will be thrown, etc. If you have to read the code to get this, it's either badly encapsulated or badly documented.
3) Use tools like my AtomineerUtils addin or Submain's GhostDoc, which mean that his documentation comments can be kept in sync with the code with very little effort.

Code comments can sometimes be invaluable. Many times I've struggled to determine the intent of code unsuccessfully. For code of any complexity, it can be helpful if the author documents their intent. Of course, well written unit tests can make intent clear as well.

Personally I don't write a whole lot of comments, but common types of comments I do write are:
(sketch) proofs that my code is correct
explanations why not to just do the obvious thing (e.g: optimization was required, the API I'm calling is shockingly misleading, the "obvious thing" led to a subtle bug)
a description of the input edge-case which necessitates special handling, thus explaining why the special handling code appears
a reference to the requirement or specification which mandates the behaviour implemented by a particular line of code
top-level description of a series of steps taken: ("first get the input", "now strip the wrapper", "finally parse it"), which aren't needed if the names of the functions called for each step are well-chosen and unambiguous in context, but I'm not going to write a do-nothing wrapper for a generic function just to give it a more specific name for one single use
documentation to be extracted automatically. This is a large proportion of comments by volume, but I'm not sure it "really counts".
things that could be part of the documented interface, but aren't because they might need to change in future, or because they're internal helper stuff that outsiders don't need. Hence, useful for the implementer to be aware of, but not for the user. Private class invariants are included in this - do you really want to read through the whole class every time you want to rely on "this member is non-null", or "size is no greater than capacity", or "access to this member must be synchronized"? Better to check the description of the member before assigning to it, to see whether you're allowed to. If someone doesn't have the discipline to avoid breaking a clearly commented invariant, well, no wonder all their comments are wrong...
Some of those things are handled by comprehensive tests, in the sense that if the tests pass then the code is correct (hah), if you have the test code open too then you can see the special input cases, and if some fool hides the comments, makes the same mistake I originally did, and refactors my code to do the obvious thing, then the tests will fail. That doesn't mean comments don't handle it better, in the context of reading through code. In none of these cases is reading the comment a substitute for reading the code, they're designed to highlight what's important but not obvious.
Of course in the case of auto-docs, it's perfectly reasonable, and probably more usable, to view the documentation separately.
When debugging, as opposed to making changes, it's way more valuable to let the code explain itself. Comments are less useful, especially those private invariants, because if you're debugging then there's a reasonable chance that whoever wrote the code+comments made a mistake, so the comments may well be wrong. On the plus side, if you see a comment that doesn't reflect what the code does, there's a reasonable chance you've found a bug. If you just see what the code does, it might be some time before you realise that what the code does isn't what it's supposed to do.
Sometimes good naming goes a long way towards replacing comments. However, names can be just as misleading as comments if they're wrong. And, let's be honest, frequently names are at least slightly misleading or ambiguous. I doubt that the author of that blog post would go to the extent of replacing all names with "var1, var2, func1, func2, class1, class2 ...", just to make sure that they weren't distracted by the original author's incorrect intentions for the code. And I don't think it's any truer to say that "comments are always wrong" than to say that "variable names are always wrong". They're written by the same person at the same time with the same purpose.

Some comments are useful and good, other comments are not.
Yes, comments and code should be timestamped, but you shouldn't timestamp it yourself. Your source control system should manage this for you, and you should be able to access this information (e.g. using cvs annotate or svn blame).

Comment why you are doing something and not what you did syntactically.

Related

Respecting Fellow Developers [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 13 years ago.
Improve this question
We've all been there. You have written some code and unit tests, the tests all pass, and the code is decent (nothing's perfect, right?). Then, someone who is sure that they know better than you comes along and decides to change your code or the interfaces to your code just because he/she does not like the variable/class names that you used. No "real" refactorings, no real optimizations, no real improvement -- just different words -- not necessarily better words, just different.
My real problem with this is that (a) its a waste of time and (b) it shows a blatant disrespect for the fellow developer that wrote the code in the first place.
My visceral response is to lash out, but that's counter productive. Instead, I though that I might wright a paragraph or two as sort of a "Charter" or "Philosophy" that is adopted for the project. I'm wondering if anyone else has tried this, and if so, was it successful?
After looking at the initial comments below (which are appreciated), I think that my problem is primarily that this change broke the build for code that was already working. So time needed to be spent to fix the code for what was (in my opinion) a non value-added change.
-- Thanks
...decides to change your code or the
interfaces to your code just because
he/she does not like the
variable/class names that you used.
My real problem with this is that (a)
its a waste of time and (b) it shows a
blatant disrespect for the fellow
developer that wrote the code in the
first place.
My visceral response is to lash out...
I see some VERY CONCERNING things in those statements.
naming is REALLY, REALLY important. It is worth rewriting code to get it correct.
It is not YOUR code
How is it disrespectful?
You are taking it too personally.
I once worked with someone who freaked out when I made changes to "his" code. His code as horrible; it was buggy and unmaintainable. He was always staying late, fighting fires and breaking things - basically a negative contributor. I rewrote all his bad code for a big piece of functionality for a project one weekend and when he came back in on monday had a hissy fit. I am not saying your stuff is horrible, but maybe you need to calm down and be more objective about it.
Don't take it so personally. Step back and think about it - maybe "your" code needed fixing
We might be able to give better answers if you posted the code and changes, or at least some better idea of the situation with an example or two.
EDIT:
After seeing the code change and finding out that the build was broken, I am going to have to change the tone of this answer. I understand Steve's frustration - and i agree - that is not a good change. It makes a specific typedef more general and not very descriptive any more.
While I think some of my points are valid, in this case it looks like the changes were not appropriate.
The issue of code "ownership" is irrelevant. If the code changes are useless then everyone on the team should not be happy. If they are good changes then everyone should be happy about it. If there is a difference of opinion then you all need to find a common ground.
Breaking the build is not a good thing.
Steve, sorry if I came down harsh - it looks like in this instance you are justified in your frustration, but not because it is "your" code.
One thing that might help in this sort of situation is to require code reviews for all changes. People are less likely to make pointless changes if someone else has to review it first. If they can actually convince another developer that their change should go in then maybe it isn't so pointless after all.
Heey,
Guys! WE all need TO TALK!
Just sit together and TALK! There are always reasons to change and there are always reasons NOT to.
Decide together!
Don't just go to StackOverflow or a forum and say ask this kind of questions.
The new dev does it - he gets responses from community probably positive (yeahh, bad code should be refactored).
The current dev does it - he gets responses from the community too: "What an idiot could do such kind of changes!"
And the result is: Counterproductive, destructive, offensive environment for a long time.
Who wants it?
Just put your arguments on the table and that's it.
New dev needs some introduction too.
Old dev needs to listed TOO.
This should be collaborative work AND not pissing each other off.
Decide together, talk, as THE TEAM.
And... better ask questions like "How is it better to refactor this?"
Cheers.
In any software development team with a size > 1, standardization is key. Not only for each developer to understand the other's code, but so that the people that come along in 2 years, 5 years and 10 years can look at any part of the code and see a clear and consistent pattern. Will you... and the rest of the team... seriously still be there, working on this project, years down the line?
If you both "just have your way of doing things" and there is no formal standard for the project/company, I suggest you work with the team and/or your boss to suggest a formal standard be adopted. There are many standards already published for the various environments that you can use either as the standard, or as a starting point.
If there is a formal standard, everyone on the team is obligated to follow it... no matter how much "better" they think their way is.
So much for the hard skills.
Now on the soft skills side...
You and your colleague need to develop a healthy relationship, or decide to work in different places. Tit-for-tats that result in people feeling that they want to lash out will make everyone unhappy, not to mention gravely jeopardize the project everyone is being paid to complete. Look for a person you both respect (maybe your boss, maybe a respected and level-headed senior member of the team, maybe HR if you have a good HR department). Explain to that person what the problem is and that it makes you feel unvalued and disrespected. Ask for help talking through the situation with your colleague and agreeing to a better way of working together.
Finally, you need to be open to the possibility that your colleague may be making subjectively correct changes, even if the manner he's doing it in offends you. Separate the correct coding from the correct interpersonal interactions. Do the right thing for the project.
Well if that guy is going to maintain your code, let him do whatever he wants to.
Just remember that it is not "your" code. The code belongs to the company for which you work for. You wrote the code and you got paid for it. Let the Management do whatever they want to do with it.
Don't take things personally, move on.
Sometimes, changing names might be justified. It can be confusing if half the project refers to a person's sex, and then you check in some new code that refers to gender or something. Okay, this might be a bad example as technically they are two different things and their meaning is most likely still obvious. But if a project's code uses two different terms to refer to the same concept, it can be confusing.
Usually I try to leave people's code alone, unless I have some justification for refactoring. Luckily the same seems to go for my colleagues, so no, I have not had the need for writing such a charter yet.
How about using an automated build system, so when this person changes the code and breaks something the team will get an alert about it. This solves your problem with having to waste your time fixing something broken by someone elses change to your code. This way everyone will know that so and so made a change and broke the build, and can see for themself. The rule is "dont break the build".
You should be discussing this with the person who did it, in a non-threatening manner.
I believe every developer should take responsibility and hence own some of the code, but not all of the code. I understand the code that I've written better (irrespective of how good/bad it is) than any other guy that has ever seen it. Therefore the changes I make will be faster and less prone to error.
I don't mind anybody changing the code I've written later on, but I have a couple of conditions:
If you change the code and that causes something else to break, you are responsible for fixing it, not me.
If I don't agree with the changes you made I will change it back to the way I want it since I have to take responsibility for this piece of code in future.
Not all developers should be making changes to all the code, all the time. Only some of the time, for the purpose of getting to know the code (sharing knowledge).
I've never worked for an employer that endorses a "everyone can change anything any time" policy. Developers own certain parts of the code and they are asked specifically to make changes/refactor based on a development democracy.
You touch my code and break something, (1) you better have a good reason for the change that all developers agree with and (2) you better not leave broken things broken or ask me to go do the clean-up for you UNLESS you're my superior. I will humbly submit if that's the case.
I agree with Laurence that code reviews might help. This is an issue of how your team should work together. What might help is the notion of Egoless Programming - in a nutshell, considering the code as a joint product of the team, and trying to make decisions for the sake of the code rather than because of the programmer's ego. Your teammate violated the fourth commandment of egoless programming - "Don't rewrite code without consultation."
Maybe if your team is made aware of these principles, things will improve. I would try this.
Perhaps not completely on topic, but .... If you have developers who have the time to make changes to code just because they don't like the variable names used, then maybe the conversation should be about whether you have too many developers and which ones should be shown the door ... or how you're going to justify to management the bloated staff you have, especially in the current economic circumstances!

Anyone else find naming classes and methods one of the most difficult parts in programming? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
So I'm working on this class that's supposed to request help documentation from a vendor through a web service. I try to name it DocumentRetriever, VendorDocRequester, DocGetter, but they just don't sound right. I ended up browsing through dictionary.com for half an hour trying to come up with an adequate word.
Start programming with bad names is like having a very bad hair day in the morning, the rest of the day goes downhill from there. Feel me?
What you are doing now is fine, and I highly recommend you stick with your current syntax, being:
context + verb + how
I use this method to name functions/methods, SQL stored procs, etc. By keeping with this syntax, it will keep your Intellisense/Code Panes much more neat. So you want EmployeeGetByID() EmployeeAdd(), EmployeeDeleteByID(). When you use a more grammatically correct syntax such as GetEmployee(), AddEmployee() you'll see that this gets really messy if you have multiple Gets in the same class as unrelated things will be grouped together.
I akin this to naming files with dates, you want to say 2009-01-07.log not 1-7-2009.log because after you have a bunch of them, the order becomes totally useless.
One lesson I have learned, is that if you can't find a name for a class, there is almost always something wrong with that class:
you don't need it
it does too much
A good naming convention should minimize the number of possible names you can use for any given variable, class, method, or function. If there is only one possible name, you'll never have trouble remembering it.
For functions and for singleton classes, I scrutinize the function to see if its basic function is to transform one kind of thing into another kind of thing. I'm using that term very loosely, but you'll discover that a HUGE number of functions that you write essentially take something in one form and produce something in another form.
In your case it sounds like your class transforms a Url into a Document. It's a little bit weird to think of it that way, but perfectly correct, and when you start looking for this pattern, you'll see it everywhere.
When I find this pattern, I always name the function xFromy.
Since your function transforms a Url into a Document, I would name it
DocumentFromUrl
This pattern is remarkably common. For example:
atoi -> IntFromString
GetWindowWidth -> WidthInPixelsFromHwnd // or DxFromWnd if you like Hungarian
CreateProcess -> ProcessFromCommandLine
You could also use UrlToDocument if you're more comfortable with that order. Whether you say xFromy or yTox is probably a matter of taste, but I prefer the From order because that way the beginning of the function name already tells you what type it returns.
Pick one convention and stick to it. If you are careful to use the same names as your class names in your xFromy functions, it'll be a lot easier to remember what names you used. Of course, this pattern doesn't work for everything, but it does work where you're writing code that can be thought of as "functional."
Sometimes there isn't a good name for a class or method, it happens to us all. Often times, however, the inability to come up with a name may be a hint to something wrong with your design. Does your method have too many responsibilities? Does your class encapsulate a coherent idea?
Thread 1:
function programming_job(){
while (i make classes){
Give each class a name quickly; always fairly long and descriptive.
Implement and test each class to see what they really are.
while (not satisfied){
Re-visit each class and make small adjustments
}
}
}
Thread 2:
while(true){
if (any code smells bad){
rework, rename until at least somewhat better
}
}
There's no Thread.sleep(...) anywhere here.
I do spend a lot of time as well worrying about the names of anything that can be given a name when I am programming. I'd say it pays off very well though. Sometimes when I am stuck I leave it for a while and during a coffee break I ask around a bit if someone has a good suggestion.
For your class I'd suggest VendorHelpDocRequester.
The book Code Complete by Steve Mcconnell has a nice chapter on naming variables/classes/functions/...
I think this is a side effect.
It's not the actual naming that's hard. What's hard is that the process of naming makes you face the horrible fact that you have no idea what the hell you're doing.
I actually just heard this quote yesterday, through the Signal vs. Noise blog at 37Signals, and I certainly agree with it:
"There are only two hard things in Computer Science: cache invalidation and naming things."
— Phil Karlton
It's good that it's difficult. It's forcing you to think about the problem, and what the class is actually supposed to do. Good names can help lead to good design.
Agreed. I like to keep my type names and variables as descriptive as possible without being too horrendously long, but sometimes there's just a certain concept that you can't find a good word for.
In that case, it always helps me to ask a coworker for input - even if they don't ultimately help, it usually helps me to at least explain it out loud and get my wheels turning.
I was just writing on naming conventions last month: http://caseysoftware.com/blog/useful-naming-conventions
The gist of it:
verbAdjectiveNounStructure - with Structure and Adjective as optional parts
For verbs, I stick to action verbs: save, delete, notify, update, or generate. Once in a while, I use "process" but only to specifically refer to queues or work backlogs.
For nouns, I use the class or object being interacted with. In web2project, this is often Tasks or Projects. If it's Javascript interacting with the page, it might be body or table. The point is that the code clearly describes the object it's interacting with.
The structure is optional because it's unique to the situation. A listing screen might request a List or an Array. One of the core functions used in the Project List for web2project is simply getProjectList. It doesn't modify the underlying data, just the representation of the data.
The adjectives are something else entirely. They are used as modifiers to the noun. Something as simple as getOpenProjects might be easily implemented with a getProjects and a switch parameter, but this tends to generate methods which require quite a bit of understanding of the underlying data and/or structure of the object... not necessarily something you want to encourage. By having more explicit and specific functions, you can completely wrap and hide the implementation from the code using it. Isn't that one of the points of OO?
More so than just naming a class, creating an appropriate package structure can be a difficult but rewarding challenge. You need to consider separating the concerns of your modules and how they relate to the vision of the application.
Consider the layout of your app now:
App
VendorDocRequester (read from web service and provide data)
VendorDocViewer (use requester to provide vendor docs)
I would venture to guess that there's a lot going on inside a few classes. If you were to refactor this into a more MVC-ified approach, and allow small classes to handle individual duties, you might end up with something like:
App
VendorDocs
Model
Document (plain object that holds data)
WebServiceConsumer (deal with nitty gritty in web service)
Controller
DatabaseAdapter (handle persistance using ORM or other method)
WebServiceAdapter (utilize Consumer to grab a Document and stick it in database)
View
HelpViewer (use DBAdapter to spit out the documention)
Then your class names rely on the namespace to provide full context. The classes themselves can be inherently related to application without needing to explicitly say so. Class names are simpler and easier to define as a result!
One other very important suggestion: please do yourself a favor and pick up a copy of Head First Design Patterns. It's a fantastic, easy-reading book that will help you organize your application and write better code. Appreciating design patterns will help you to understanding that many of the problems you encounter have already been solved, and you'll be able to incorporate the solutions into your code.
Leo Brodie, in his book "Thinking Forth", wrote that the most difficult task for a programmer was naming things well, and he stated that the most important programming tool is a thesaurus.
Try using the thesaurus at http://thesaurus.reference.com/.
Beyond that, don't use Hungarian Notation EVER, avoid abbreviations, and be consistent.
Best wishes.
In short:
I agree that good names are important, but I don't think you have to find them before implementing at all costs.
Of course its better to have a good name right from the start. But if you can't come up with one in 2 minutes, renaming later will cost less time and is the right choice from a productivity point of view.
Long:
Generally it's often not worth to think too long about a name before implementing. If you implement your class, naming it "Foo" or "Dsnfdkgx", while implementing you see what you should have named it.
Especially with Java+Eclipse, renaming things is no pain at all, as it carefully handles all references in all classes, warns you of name collisions, etc. And as long as the class is not yet in the version control repository, I don't think there's anything wrong with renaming it 5 times.
Basically, it's a question of how you think about refactoring. Personally, I like it, though it annoys my team mates sometimes, as they believe in never touch a running system. And from everything you can refactor, changing names is one of the most harmless things you can do.
Why not HelpDocumentServiceClient kind of a mouthful, or HelpDocumentClient...it doesn't matter it's a vendor the point is it's a client to a webservice that deals with Help documents.
And yes naming is hard.
There is only one sensible name for that class:
HelpRequest
Don't let the implementation details distract you from the meaning.
Invest in a good refactoring tool!
I stick to basics: VerbNoun(arguments). Examples: GetDoc(docID).
There's no need to get fancy. It will be easy to understand a year from now, whether it's you or someone else.
For me I don't care how long a method or class name is as long as its descriptive and in the correct library. Long gone are the days where you should remember where each part of the API resides.
Intelisense exists for all major languages. Therefore when using a 3rd party API I like to use its intelisense for the documentation as opposed to using the 'actual' documentation.
With that in mind I am fine to create a method name such as
StevesPostOnMethodNamesBeingLongOrShort
Long - but so what. Who doesnt use 24inch screens these days!
I have to agree that naming is an art. It gets a little easier if your class is following a certain "desigh pattern" (factory etc).
This is one of the reasons to have a coding standard. Having a standard tends to assist coming up with names when required. It helps free up your mind to use for other more interesting things! (-:
I'd recommend reading the relevant chapter of Steve McConnell's Code Complete (Amazon link) which goes into several rules to assist readability and even maintainability.
HTH
cheers,
Rob
Nope, debugging is the most difficult thing thing for me! :-)
DocumentFetcher? It's hard to say without context.
It can help to act like a mathematician and borrow/invent a lexicon for your domain as you go: settle on short plain words that suggest the concept without spelling it out every time. Too often I see long latinate phrases that get turned into acronyms, making you need a dictionary for the acronyms anyway.
The language you use to describe the problem, is the language you should use for the variables, methods, objects, classes, etc. Loosely, nouns match objects and verbs match methods. If you're missing words to describe the problem, you're also missing a full understanding (specification) of the problem.
If it's just choosing between a set of names, then it should be driven by the conventions you are using to build the system. If you've come to a new spot, uncovered by previous conventions, then it's always worth spending some effort on trying extend them (properly, consistently) to cover this new case.
If in doubt, sleep on it, and pick the first most obvious name, the next morning :-)
If you wake up one day and realize you were wrong, then change it right away.
Paul.
BTW: Document.fetch() is pretty obvious.
I find I have the most trouble in local variables. For example, I want to create an object of type DocGetter. So I know it's a DocGetter. Why do I need to give it another name? I usually end up giving it a name like dg (for DocGetter) or temp or something equally nondescriptive.
Don't forget design patterns (not just the GoF ones) are a good way of providing a common vocabulary and their names should be used whenever one fits the situation. That will even help newcomers that are familiar with the nomenclature to quickly understand the architecture. Is this class you're working on supposed to act like a Proxy, or even a Façade ?
Shouldn't the vendor documentation be the object? I mean, that one is tangible, and not just as some anthropomorphization of a part of your program. So, you might have a VendorDocumentation class with a constructor that fetches the information. I think that if a class name contains a verb, often something has gone wrong.
I definitely feel you. And I feel your pain. Every name I think of just seems rubbish to me. It all seems so generic and I want to eventually learn how to inject a bit of flair and creativity into my names, making them really reflect what they describe.
One suggestion I have is to consult a Thesaurus. Word has a good one, as does Mac OS X. That can really help me get my head out of the clouds and gives me a good starting place as well as some inspiration.
If the name would explain itself to a lay programmer then there's probably no need to change it.

What OOP coding practices should you always make time for?

I tend to do a lot of projects on short deadlines and with lots of code that will never be used again, so there's always pressure/temptation to cut corners. One rule I always stick to is encapsulation/loose coupling, so I have lots of small classes rather than one giant God class. But what else should I never compromise on?
Update - thanks for the great response. Lots of people have suggested unit testing, but I don't think that's really appropriate to the kind of UI coding I do. Usability / User acceptance testing seems much important. To reiterate, I'm talking about the BARE MINIMUM of coding standards for impossible deadline projects.
Not OOP, but a practice that helps in both the short and long run is DRY, Don't Repeat Yourself. Don't use copy/paste inheritance.
Not a OOP practice, but common sense ;-).
If you are in a hurry, and have to write a hack. Always add a piece of comment with the reasons. So you can trace it back and make a good solution later.
If you never had the time to come back, you always have the comment so you know, why the solution was chosen at the moment.
Use Source control.
No matter how long it takes to set up (seconds..), it will always make your life easier! (still it's not OOP related).
Naming. Under pressure you'll write horrible code that you won't have time to document or even comment. Naming variables, methods and classes as explicitly as possible takes almost no additional time and will make the mess readable when you must fix it. From an OOP point of view, using nouns for classes and verbs for methods naturally helps encapsulation and modularity.
Unit tests - helps you sleep at night :-)
This is rather obvious (I hope), but at the very least I always make sure my public interface is as correct as possible. The internals of a class can always be refactored later on.
no public class with mutable public variables (struct-like).
Before you know it, you refer to this public variable all over your code, and the day you decide this field is a computed one and must have some logic in it... the refactoring gets messy.
If that day is before your release date, it gets messier.
Think about the people (may even be your future self) who have to read and understand the code at some point.
Application of the single responsibility principal. Effectively applying this principal generates a lot of positive externalities.
Like everyone else, not as much OOP practices, as much as practices for coding that apply to OOP.
Unit test, unit test, unit test. Defined unit tests have a habit of keeping people on task and not "wandering" aimlessly between objects.
Define and document all hierarchical information (namespaces, packages, folder structures, etc.) prior to writing production code. This helps to flesh out object relations and expose flaws in assumptions related to relationships of objects.
Define and document all applicable interfaces prior to writing production code. If done by a lead or an architect, this practice can additionally help keep more junior-level developers on task.
There are probably countless other "shoulds", but if I had to pick my top three, that would be the list.
Edit in response to comment:
This is precisely why you need to do these things up front. All of these sorts of practices make continued maintenance easier. As you assume more risk in the kickoff of a project, the more likely it is that you will spend more and more time maintaining the code. Granted, there is a larger upfront cost, but building on a solid foundation pays for itself. Is your obstacle lack of time (i.e. having to maintain other applications) or a decision from higher up? I have had to fight both of those fronts to be able to adopt these kinds of practices, and it isn't a pleasant situation to be in.
Of course everything should be Unit tested, well designed, commented, checked into source control and free of bugs. But life is not like that.
My personal ranking is this:
Use source control and actually write commit comments. This way you have a tiny bit of documentation should you ever wonder "what the heck did I think when I wrote this?"
Write clean code or document. Clean well-written code should need little documentation, as it's meaning can be grasped from reading it. Hacks are a lot different. Write why you did it, what you do and what you'd like to do if you had the time/knowledge/motivation/... to do it right
Unit Test. Yes it's down on number three. Not because it's unimportant but because it's useless if you don't have the other two at least halfway complete. Writing Unit tests is another level of documentation what you code should be doing (among others).
Refactor before you add something. This might sound like a typical "but we don't have time for it" point. But as with many of those points it usually saves more time than it costs. At least if you have at least some experience with it.
I'm aware that much of this has already been mentioned, but since it's a rather subjective matter, I wanted to add my ranking.
[insert boilerplate not-OOP specific caveat here]
Separation of concerns, unit tests, and that feeling that if something is too complex it's probably not conceptualised quite right yet.
UML sketching: this has clarified and saved any amount of wasted effort so many times. Pictures are great aren't they? :)
Really thinking about is-a's and has-a's. Getting this right first time is so important.
No matter how fast a company wants it, I pretty much always try to write code to the best of my ability.
I don't find it takes any longer and usually saves a lot of time, even in the short-term.
I've can't remember ever writing code and never looking at it again, I always make a few passes over it to test and debug it, and even in those few passes practices like refactoring to keep my code DRY, documentation (to some degree), separation of concerns and cohesion all seem to save time.
This includes crating many more small classes than most people (One concern per class, please) and often extracting initialization data into external files (or arrays) and writing little parsers for that data... Sometimes even writing little GUIs instead of editing data by hand.
Coding itself is pretty quick and easy, debugging crap someone wrote when they were "Under pressure" is what takes all the time!
At almost a year into my current project I finally set up an automated build that pushes any new commits to the test server, and man, I wish I had done that on day one. The biggest mistake I made early-on was going dark. With every feature, enhancement, bug-fix etc, I had a bad case of the "just one mores" before I would let anyone see the product, and it literally spiraled into a six month cycle. If every reasonable change had been automatically pushed out it would have been harder for me to hide, and I would have been more on-track with regard to the stakeholders' involvement.
Go back to code you wrote a few days/weeks ago and spend 20 minutes reviewing your own code. With the passage of time, you will be able to determine whether your "off-the-cuff" code is organized well enough for future maintenance efforts. While you're in there, look for refactoring and renaming opportunities.
I sometimes find that the name I chose for a function at the outset doesn't perfectly fit the function in its final form. With refactoring tools, you can easily change the name early before it goes into widespread use.
Just like everybody else has suggested these recommendations aren't specific to OOP:
Ensure that you comment your code and use sensibly named variables. If you ever have to look back upon the quick and dirty code you've written, you should be able to understand it easily. A general rule that I follow is; if you deleted all of the code and only had the comments left, you should still be able to understand the program flow.
Hacks usually tend to be convoluted and un-intuitive, so some good commenting is essential.
I'd also recommend that if you usually have to work to tight deadlines, get yourself a code library built up based upon your most common tasks. This will allow you to "join the dots" rather than reinvent the wheel each time you have a project.
Regards,
Docta
An actual OOP practice I always make time for is the Single Responsibility Principle, because it becomes so much harder to properly refactor the code later on when the project is "live".
By sticking to this principle I find that the code I write is easily re-used, replaced or rewritten if it fails to match the functional or non-functional requirements. When you end up with classes that have multiple responsibilities, some of them may fulfill the requirements, some may not, and the whole may be entirely unclear.
These kinds of classes are stressful to maintain because you are never sure what your "fix" will break.
For this special case (short deadlines and with lots of code that will never be used again) I suggest you to pay attention to embedding some script engine into your OOP code.
Learn to "refactor as-you-go". Mainly from an "extract method" standpoint. When you start to write a block of sequential code, take a few seconds to decide if this block could stand-alone as a reusable method and, if so, make that method immediately. I recommend it even for throw-away projects (especially if you can go back later and compile such methods into your personal toolbox API). It doesn't take long before you do it almost without thinking.
Hopefully you do this already and I'm preaching to the choir.

Do you follow the naming convention of the original programmer?

If you take over a project from someone to do simple updates do you follow their naming convention? I just received a project where the previous programmer used Hungarian Notation everywhere. Our core product has a naming standard, but we've had a lot of people do custom reporting over the years and do whatever they felt like.
I do not have time to change all of the variable names already in the code though.
I'm inclined for readablity just to continue with their naming convention.
Yes, I do. It makes it easier to follow by the people who inherit it after you. I do try and clean up the code a little to make it more readable if it's really difficult to understand.
I agree suggest that leaving the code as the author wrote it is fine as long as that code is internally consistent. If the code is difficult to follow because of inconsistency, you have a responsibility to the future maintainer (probably you) to make it clearer.
If you spend 40 hours figuring out what a function does because it uses poorly named variables, etc., you should refactor/rename for clarity/add commentary/do whatever is appropriate for the situation.
That said, if the only issue is that the mostly consistent style that the author used is different from the company standard or what you're used to, I think you're wasting your time renaming everything. Also, you may loose a source of expertise if the original author is still available for questions because he won't recognize the code anymore.
If you're not changing all the existing code to your standard, then I'd say stick with the original conventions as long as you're changing those files. Mixing two styles of code in the same file is destroying any benefit that a consistent code style would have, and the next guy would have to constantly ask himself "who wrote this function, what's it going to be called - FooBar() or fooBar()?"
This kind of thing gets even trickier when you're importing 3rd party libraries - you don't want to rewrite them, but their code style might not match yours. So in the end, you'll end up with several different naming conventions, and it's best to draw clear lines between "our code" and "their code".
Often, making a wholesale change to a codebase just to conform with the style guide is just a way to introduce new bugs with little added value.
This means that either you should:
Update the code you're working on to conform to the guideline as you work on it.
Use the conventions in the code to aide future maintenance efforts.
I'd recommend 2., but Hungarian Notation makes my eyes bleed :p.
If you are maintaining code that others wrote and that other people are going to maintain after you, you owe it to everybody involved not to make gratuitous changes. When they go into the source code control system to see what you changed, they should see what was necessary to fix the problem you were working on, and not a million diffs because you did a bunch of global searches and replaces or reformatted the code to fit your favourite brace matching convention.
Of course, if the original code really sucks, all bets are off.
Generally, yes, I'd go for convention and readability over standards in this scenario. No one likes that answer, but it's the right thing to do to keep the code maintainable long-term.
When a good programmer's reading code, he should be able to parse the variable names and keep track of several in his head -- as long as their consistent, at least within the source file. But if you break that consistency, it will likely force the programmer reading the code to suffer some cognitive dissonance, which would then make it a bit harder to keep track of. It's not a killer -- good programmers will get through it, but they'll curse your name and probably post you on TheDailyWTF.
I certainly would continue to use the same naming convention, as it'll keep the code consistent (even if it is consistently ugly) and more readable than mixing variable naming conventions. Human brains seem to be rather good at pattern recognition and you don't really want to throw the brain a curveball by gratuitously breaking said pattern.
That said, I'm anything but a few of Hungarian Notation but if that's what you've got to work with...
If the file or project is already written using a consistent style then you should try to follow that style, even if it conflicts/contradicts your existing style. One of the main goals of a code style is consistency, so if you introduce a different style in to code that is already consistent (within itself) you loose that consistency.
If the code is poorly written and requires some level of cleanup in order to understand it then cleaning up the style becomes a more relevant option, but you should only do so if absolutely necessary (especially if there are no unit tests) as you run the possiblity of introducing unexpected breaking changes.
Absolutely, yes. The one case where I don't believe it's preferable to follow the original programmer's naming convention is when the original programmer (or subsequent devs who've modified the code since then) failed to follow any consistent naming convention.
Yes. I actually wrote this up in a standards doc. I created at my current company:
Existing code supersedes all other standards and practices (whether they are industry-wide standards or those found in this document). In most cases, you should chameleon your code to match the existing code in the same files, for two reasons:
To avoid having multiple, distinct styles/patterns within a single module/file (which contradict the purpose of standards and hamper maintainability).
The effort of refactoring existing code is prone to being unnecessarily more costly (time-consuming and conducive to introduction of new bugs).
Personally whenever I take over a project that has a different variable naming scheme I tend to keep the same scheme that was being used by the previous programmer. The only thing I do different is for any new variables I add, I put an underscore before the variable name. This way I can quickly see my variables and my code without having to go into the source history and comparing versions. But when it comes to me inheriting simply unreadable code or comments I will usually go through them and clean them up as best I can without re-writing the whole thing (It has come to that). Organization is key to having extensible code!
if I can read the code, I (try) to take the same conventions
if it's not readable anyway I need to refactor and thus changing it (depending on what its like) considerable
Depends. If I'm building a new app and stealing the code from a legacy app with crap variable naming, I'll refactor once I get it into my app.
Yes..
There is litte that's more frustrating then walking into an application that has two drasticly different styles. One project I reciently worked on had two different ways of manipulating files, two different ways to implement screens, two different fundimental structures. The second coder even went so far as to make the new features part of a dll that gets called from the main code. Maintence was nightmarish and I had to learn both paradigms and hope when I was in one section I was working with the right one.
When in Rome do as the Romans do.
(Except for index variables names, e.g. "iArrayIndex++". Stop condoning that idiocy.)
I think of making a bug fix as a surgical procedure. Get in, disturb as little as possible, fix it, get out, leave as little trace of your being there as possible.
I do, but unfortunately, there where several developers before me that did not live to this rule, so I have several naming conventions to choose from.
But sometimes we get the time to set things straight so in the end, it will be nice and clean.
If the code already has a consistent style, including naming, I try to follow it. If previous programmers were not consistent, then I feel free to apply the company standard, or my personal standards if there is not any company standard.
In either case I try to mark the changes I have made by framing them with comments. I know with todays CVS systems this is often not done, but I still prefer to do it.
Unfortunately, most of the time the answer is yes. Most of the time, the code does not follow good conventions so it's hard to follow the precedent. But for readability, it's sometimes necessary to go with the flow.
However, if it's a small enough of an application that I can refactor a lot of the existing code to "smell" better, then I'll do so. Or, if this is part of a larger re-write, I'll also begin coding with the current coding standards. But this is not usually the case.
If there's a standard in the existing app, I think it's best to follow it. If there is no standard (tabs and spaces mixed, braces everywhere... oh the horror), then I do what I feel is best and generally run the existing code through a formatting tool (like Vim). I'll always keep the capitalization style, etc of the existing code if there is a coherent style.
My one exception to this rule is that I will not use hungarian notation unless someone has a gun to my head. I won't take the time to rename existing stuff, but anything I add new isn't going to have any hungarian warts on it.

What is self-documenting code and can it replace well documented code? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
I have a colleague who insists that his code doesn't need comments, it's "self documenting."
I've reviewed his code, and while it's clearer than code which I've seen others produce, I still disagree that self-documenting code is as complete and useful as well commented and documented code.
Help me understand his point of view.
What is self documenting code
Can it really replace well commented and documented code
Are there situations where it's better than well documented and commented code
Are there examples where code cannot possibly be self-documenting without comments
Maybe it's just my own limitations, but I don't see how it can be a good practice.
This is not meant to be an argument - please don't bring up reasons why well commented and documented code is high priority - there are many resources showing this, but they aren't convincing to my peer. I believe I need to more fully understand his perspective to convince him otherwise. Start a new question if you must, but don't argue here.
Also, those of you who are arguing against self documenting code -this is primarily to help me understand the perspective (ie, positive aspects) of self-documenting code evangelists.
Well, since this is about comments and code, let's look at some actual code. Compare this typical code:
float a, b, c; a=9.81; b=5; c= .5*a*(b^2);
To this self-documenting code, which shows what is being done:
const float gravitationalForce = 9.81;
float timeInSeconds = 5;
float displacement = (1 / 2) * gravitationalForce * (timeInSeconds ^ 2);
And then to this documented code, which better explains why it is being done:
/* compute displacement with Newton's equation x = vₒt + ½at² */
const float gravitationalForce = 9.81;
float timeInSeconds = 5;
float displacement = (1 / 2) * gravitationalForce * (timeInSeconds ^ 2);
And the final version of code as documentation with zero comments needed:
float computeDisplacement(float timeInSeconds) {
const float gravitationalForce = 9.81;
float displacement = (1 / 2) * gravitationalForce * (timeInSeconds ^ 2);
return displacement;
}
Here's an example of a poor commenting style:
const float a = 9.81; //gravitational force
float b = 5; //time in seconds
float c = (1/2)*a*(b^2) //multiply the time and gravity together to get displacement.
In the last example, comments are used when variables should have been descriptively named instead, and the results of an operation are summarized when we can clearly see what the operation is. I would prefer the self-documented second example to this any day, and perhaps that is what your friend is talking about when he says self-documented code.
I would say that it depends on the context of what you are doing. To me, the self-documented code is probably sufficient in this case, but a comment detailing the methodology behind what is behind done (in this example, the equation) is also useful.
In my opinion, any code should be self-documenting. In good, self-documented code, you don't have to explain every single line because every identifier (variable, method, class) has a clear semantic name. Having more comments than necessary actually makes it harder (!) to read the code, so if your colleague
writes documentation comments (Doxygen, JavaDoc, XML comments etc.) for every class, member, type and method AND
clearly comments any parts of the code that are not self-documenting AND
writes a comment for each block of code that explains the intent, or what the code does on a higher abstraction level (i.e. find all files larger than 10 MB instead of loop through all files in a directory, test if file size is larger than 10 MB, yield return if true)
his code and documentation is fine, in my opinion. Note that self-documented code does not mean that there should be no comments, but only that there should be no unnecessary comments. The thing is, however, that by reading the code (including comments and documentation comments) should yield an immediate understanding of what the code does and why. If the "self-documenting" code takes longer to understand than commented code, it is not really self-documenting.
The code itself is always going to be the most up-to-date explanation of what your code does, but in my opinion it's very hard for it to explain intent, which is the most vital aspect of comments. If it's written properly, we already know what the code does, we just need to know why on earth it does it!
Someone once said
1) Only write comments for code that's hard to understand.
2) Try not to write code that's hard to understand.
The idea behind "self-documenting" code is that the actual program logic in the code is trivially clear enough to explain to anyone reading the code not only what the code is doing but why it is doing it.
In my opinion, the idea of true self-documenting code is a myth. The code can tell you the logic behind what is happening, but it can't explain why it is being done a certain way, particularly if there is more than one way to solve a problem. For that reason alone it can never replace well commented code.
I think it's relevant to question whether a particular line of code is self-documenting, but in the end if you do not understand the structure and function of a slice of code then most of the time comments are not going to help. Take, for example, amdfan's slice of "correctly-commented" code:
/* compute displacement with Newton's equation x = v0t + ½at^2 */
const float gravitationalForce = 9.81;
float timeInSeconds = 5;
float displacement = (1 / 2) * gravitationalForce * (timeInSeconds ^ 2);
This code is fine, but the following is equally informative in most modern software systems, and explicitly recognizes that using a Newtonian calculation is a choice which may be altered should some other physical paradigm be more appropriate:
const float accelerationDueToGravity = 9.81;
float timeInSeconds = 5;
float displacement = NewtonianPhysics.CalculateDisplacement(accelerationDueToGravity, timeInSeconds);
In my own personal experience, there are very few "normal" coding situations where you absolutely need comments. How often do you end up rolling your own algorithm, for example? Basically everything else is a matter of structuring your system so that a coder can comprehend the structures in use and the choices which drove the system to use those particular structures.
I forget where I got this from, but:
Every comment in a program is like an apology to the reader. "I'm sorry that my code is so opaque that you can't understand it by looking at it". We just have to accept that we are not perfect but strive to be perfect and go right on apologizing when we need to.
First of all, it's good to hear that your colleague's code is in fact clearer than other code you have seen. It means that he's probably not using "self-documenting" as an excuse for being too lazy to comment his code.
Self-documenting code is code that does not require free-text comments for an informed reader to understand what it is doing. For example, this piece of code is self-documenting:
print "Hello, World!"
and so is this:
factorial n = product [1..n]
and so is this:
from BeautifulSoup import BeautifulSoup, Tag
def replace_a_href_with_span(soup):
links = soup.findAll("a")
for link in links:
tag = Tag(soup, "span", [("class", "looksLikeLink")])
tag.contents = link.contents
link.replaceWith(tag)
Now, this idea of an "informed reader" is very subjective and situational. If you or anyone else is having trouble following your colleague's code, then he'd do well to re-evaluate his idea of an informed reader. Some level of familiarity with the language and libraries being used must be assumed in order to call code self-documenting.
The best argument I have seen for writing "self-documenting code" is that it avoids the problem of free-text commentary not agreeing with the code as it is written. The best criticism is that while code can describe what and how it is doing by itself, it cannot explain why something is being done a certain way.
Self-documenting code is a good example of "DRY" (Don't Repeat Yourself). Don't duplicate information in comments which is, or can be, in the code itself.
Rather than explain what a variable is used for, rename the variable.
Rather than explain what a short snippet of code does, extract it into a method and give it a descriptive name (perhaps a shortened version of your comment text).
Rather than explain what a complicated test does, extract that into a method too and give it a good name.
Etc.
After this you end up with code that doesn't require as much explanation, it explains itself, so you should delete the comments which merely repeat information in the code.
This doesn't mean you have no comments at all, there is some information you can't put into the code such as information about intent (the "why"). In the ideal case the code and the comments complement each other, each adding unique explanatory value without duplicating the information in the other.
self-documenting code is a good practice and if done properly can easily convey the meaning of the code without reading too many comments. especially in situations where the domain is well understood by everyone in the team.
Having said that, comments can be very helpful for new comers or for testers or to generate documentation/help files.
self-documenting code + necessary comments will go a long way towards helping people across teams.
In order:
Self-documenting code is code that clearly expresses its intent to the reader.
Not entirely. Comments are always helpful for commentary on why a particular strategy was chosen. However, comments which explain what a section of code is doing are indicative of code that is insufficiently self-documenting and could use some refactoring..
Comments lie and become out of date. Code always tells is more likely to tell the truth.
I've never seen a case where the what of code couldn't be made sufficiently clear without comments; however, like I said earlier, it is sometimes necessary/helpful to include commentary on the why.
It's important to note, however, that truly self-documenting code takes a lot of self- and team-discipline. You have to learn to program more declaratively, and you have to be very humble and avoid "clever" code in favor of code that is so obvious that it seems like anyone could have written it.
When you read a "self-documenting code",
you see what it is doing,
but you cannot always guess why it is doing in that particular way.
There are tons of non-programming constraints
like business logic, security, user demands etc.
When you do maintenance, those backgorund information become very important.
Just my pinch of salt...
For one, consider the following snippet:
/**
* Sets the value of foobar.
*
* #foobar is the new vaue of foobar.
*/
public void setFoobar(Object foobar) {
this.foobar = foobar;
}
In this example you have 5 lines of comments per 3 lines of code. Even worse - the comments do not add anything which you can't see by reading the code. If you have 10 methods like this, you can get 'comment blindness' and not notice the one method that deviates from the pattern.
If course, a better version would have been:
/**
* The serialization of the foobar object is used to synchronize the qux task.
* The default value is unique instance, override if needed.
*/
public void setFoobar(Object foobar) {
this.foobar = foobar;
}
Still, for trivial code I prefer not having comments. The intent and the overall organization is better explained in a separate document outside of the code.
The difference is between "what" and "how".
You should document "what" a routine does.
You should not document "how" it does it, unless special cases (e.g. refer to a specific algorithm paper). That should be self-documented.
One thing that you may wish to point out to your colleague is that no matter how self-documenting his code is, if other alternate approaches were considered and discarded that information will get lost unless he comments the code with that information. Sometimes it's just as important to know that an alternative was considered and why it was decided against and code comments are most likely to survive over time.
self-documenting code normally uses variable names that match exactly what the code is doing so that it is easy to understand what is going on
However, such "self-documenting code" will never replace comments. Sometimes code is just too complex and self-documenting code is not enough, especially in the way of maintainability.
I once had a professor who was a firm believer in this theory
In fact the best thing I ever remember him saying is "Comments are for sissies"
It took all of us by surprise at first but it makes sense.
However, the situation is that even though you may be able to understand what is going on in the code but someone who is less experienced that you may come behind you and not understand what is going on. This is when comments become important. I know many times that we do not believe they are important but there are very few cases where comments are unnecessary.
I'm surprised that nobody has brought about "Literate Programming", a technique developed in 1981 by Donald E. Knuth of TeX and "The Art of Computer Programming" fame.
The premise is simple: since the code has to be understood by a human and comments are simply thrown away by the compiler, why not give everyone the thing they need - a full textual description of the intent of the code, unfettered by programming language requirements, for the human reader and pure code for the compiler.
Literate Programming tools do this by giving you special markup for a document that tells the tools what part should be source and what is text. The program later rips the source code parts out of the document and assembles a code file.
I found an example on the web of it: http://moonflare.com/code/select/select.nw or the HTML version http://moonflare.com/code/select/select.html
If you can find Knuth's book on it in a library (Donald E. Knuth, Literate Programming, Stanford, California: Center for the Study of Language and Information, 1992, CSLI Lecture Notes, no. 27.) you should read it.
That's self-documenting code, complete with reasoning and all. Even makes a nice document,
Everything else is just well written comments :-)
In a company where I worked one of the programmers had the following stuck to the top of her monitor.
"Document your code like the person who maintains it is a homocidal maniac who knows where you live."
I would like to offer one more perspective to the many valid answers:
What is source code? What is a programming language?
The machines don't need source code. They're happy running assembly. Programming languages are for our benefit. We don't want to write assembly. We need to understand what we are writing. Programming is about writing code.
Should you be able to read what you write?
Source code is not written in human language. It has been tried (for example FORTRAN) but it isn't completely successful.
Source code can't have ambiguity. That's why we have to put more structure in it than we do with text. Text only works with context, which we take for granted when we use text. Context in source code is always explisit. Think "using" in C#.
Most programming languages have redundancy so that the compiler can catch us when we aren't coherent. Other languages use more inference and try to eliminate that redundancy.
Type names, method names and variable names are not needed by the computers. They are used by us for referencing. The compiler doesn't understand semantics, that's for us to use.
Programming languages are a linguistic bridge between man and machine. It has to be writable for us and readable for them. Secondary demands are that it should be readable to us. If we are good at semantics where allowed and good at structuring the code, source code should be easy to read even for us. The best code doesn't need comments.
But complexity lurks in every project, you always have to decide where to put the complexity, and which camels to swallow. Those are the places to use comments.
Self documenting code is an easy opt out of the problem, that over time code, comment and documentation diverge. And it is a disciplining factor to write clear code (if you are that strict on yourself).
For me, these are the rules I try to follow:
Code should be as easy and clear to
read as possible.
Comments should give reasons for
design decisions I took, like: why
do I use this algorithm, or
limitations the code has, like: does
not work when ... (this should be
handled in a contract/assertion in
the code) (usually within the function/procedure).
Documentation should list usage
(calling converntions), side
effects, possible return values. It
can be extracted from code using
tools like jDoc or xmlDoc. It
therefore usually is outside the
function/procedure, but close to the
code it describes.
This means that all three means of documenting code live close together and therefore are more likely to be changed when the code changes, but do not overlap in what they express.
The real problem with the so-called self-documenting code is that it conveys what it actually does. While some comments may help someone understand the code better (e.g., algorithms steps, etc.) it is to a degree redundant and I doubt you would convince your peer.
However, what is really important in documentation is the stuff that is not directly evident from the code: underlying intent, assumptions, impacts, limitations, etc.
Being able to determine that a code does X from a quick glance is way easier than being able to determine that a code does not do Y. He has to document Y...
You could show him an example of a code that looks well, is obvious, but doesn't actually cover all the bases of the input, for example, and see if he finds it.
I think that self-documenting code is a good replacement for commenting. If you require comments to explain how or why code is the way it is, then you have a function or variable names that should be modified to be more explanatory. It can be down to the coder as to whether he will make up the shortfall with a comment or renaming some variables and functions and refactoring code though.
It can't really replace your documentation though, because documentation is what you give to others to explain how to use your system, rather than how it does things.
Edit: I (and probably everyone else) should probably have the provision that a Digital Signal Processing (DSP) app should be very well commented. That's mainly because DSP apps are essentially 2 for loops fed with arrays of values and adds/multiplies/etc said values... to change the program you change the values in one of the arrays... needs a couple of comments to say what you are doing in that case ;)
When writing mathematical code, I have sometimes found it useful to write long, essay-like comments, explaining the math, the notational conventions the code uses, and how it all fits together. We're talking hundreds of lines of documentation, here.
I try to make my code as self-documenting as possible, but when I come back to work on it after a few months, I really do need to read the explanation to keep from making a hash out of it.
Now, of course this kind of extreme measure isn't necessary for most cases. I think the moral of the story is: different code requires different amounts of documentation. Some code can be written so clearly that it doesn't need comments -- so write it that clearly and don't use comments there!
But lots of code does need comments to make sense, so write it as clearly as possible and then use as many comments as it needs...
I would argue - as many of you do - that to be truly self documenting, code needs to show some form of intent. But I'm surprised nobody mentioned BDD yet - Behavior Driven Development. Part of the idea is that you have automated tests (code) explaining the intent of your code, which is so difficult to make obvious otherwise.
Good domain modeling
+ good names (variabes, methods, classes)
+ code examples (unit tests from use cases)
= self documenting software
A couple of reasons why extra comments in addition to the code might be clearer:
The code you're looking at was generated automatically, and hence any edits to the code might be clobbered the next time the project is compiled
A less-than-straightforward implementation was traded off for a performance gain (unrolling a loop, creating a lookup table for an expensive calculation, etc.)
Its going to be all in what the team values in its documentation. I would suggest that documenting why/intent instead of how is important and this isn't always captured in self documenting code. get/set no these are obvious - but calculation, retrieval etc something of the why should be expressed.
Also be aware of difference in your team if you are comming from different nationalities. Differences in diction can creap into the naming of methods:
BisectionSearch
BinarySearch
BinaryChop
These three methods contributed from developers trained on 3 different continents do the same thing. Only by reading the comments that described the algorithm were we able to identify the duplication in our library.
For me reading code that needs comments is like reading text in the language I do not know. I see statement and I do not understand what it does or why - and I have to look at comments. I read a phrase and I need to look in dictionary to understand what it means.
It is usually easy to write code that self-documents what it does. To tell you why it does so comments are more suitable, but even here code can be better. If you understand your system on every level of abstraction, you should try organizing you code like
public Result whatYouWantToDo(){
howYouDoItStep1();
howYouDoItStep2();
return resultOfWhatYouHavDone;
}
Where method name reflects your intent and method body explains how you achieve your goal.
You anyway can not tell entire book in its title, so main abstractions of your system still have to be documented, as well as complex algorithms, non-trivial method contracts and artifacts.
If the code that your colleague produc is really self-documented - lucky you and him.
If you think that your colleagues code needs comments - it needs. Just open the most non-trivial place in it, read it once and see if you understood everything or not. If the code is self-documented - then you should. If not - ask your colleague a question about it, after he gives you an answer ask why that answer was not documented in comments or code beforehand. He can claim that code is self-document for such smart person as him, but he anyway has to respect other team members - if your tasks require understanding of his code and his code does not explain to you everything you need to understand - it needs comments.
Most documentation / comments serve towards assisting future code enhancers / developers hence making the code maintainable.
More often than not we would end up coming back to our module at a later time to add new features or optimize.
At that time it would be easier to understand the code by simply reading the comments than step through numerous breakpoints.
Besides i would rather spend time thinking for new logic than deciphering the existing.
I think what he might be getting at is that if comments explain what the code is doing it should be re-written to be clear what it's intent is. That's what he means by self documenting code. Often this can mean simply breaking up long function into logical smaller pieces with a descriptive function name.
That doesn't mean the code should not be commented. It means that comments should provide a reason why the code is written the way it is.
I believe that you should always strive to achieve self-documenting code because it does make it easier to read code. However, you have also got to be pragmatic about things.
For example, I usually add a comment to every class member (I use documentation comments for this). This describes what the member is supposed to do but not how it does it. I find that when I am reading through code, particularly old code, this helps me remember quickly what the member is for and I also find it easier than reading the code and working it out, particularly if the flow of code jumps around quite a bit.
This is just my opinion. I know of plenty of people who work without comments at all and say that they find this to be no problem. I have, however, asked somebody about a method they wrote six months previous and they have had to think for a few minutes to tell me exactly what it did. This isn't a problem if the method is commented.
Finally, you have to remember that comments are equally part of the system as code. As you refactor and change functionality you must also update your comments. This is one argument against using comments at all as they are worse than useless if they are incorrect.