Why don't analysis tools apply refactorings? - fxcop

I am using fxCop and NDepend a lot at the moment, and I keep seeing the items their reports generate which are "wrong"1 and wondering to myself, why can't these tools just go and make those fixes they are suggesting?
I get some are very hard to work out, but something like the fields should be marked readonly can very easily be applied with the information the tool has. However for me it means going to the tool, finding the item then placing the code in etc... Takes considerable time even for the smallest items.
I would even be happy if I had to confirm each change, in a similar to how CodeRush Xpress does with some it's refactorings.
So is there a reason why these tools do not do this?
1 Wrong is relative here, since something like the 1700 class of fxCop errors which are all about naming aren't bad code, but do make it harder for new developers to grapple the code.

Possibly because there isn't always -- or even most of the time -- a single, correct choice of refactoring to make. There usually are dozens of ways to refactor code so, that the amount of warnings will be reduced, but the one that is actually right for the project is something a developer should decide.

Rob, this is something we (the NDepend Team) are thinking for the long term. But touching code is a domain much more sensitive than just analyzing it. And as said Rytmis, often there is not only one single choice for refactoring.
Certainly the best option will be to let NDepend send its refactoring commands to a solid existing refactorer such as R#. But this is pure supposition at the moment.

Related

Code cleanup and optimization, where to start?

I want to optimize my angularjs frontend application and cleanup the code to provide better code quality.
I thought about bringing in more abstraction, since I implemented a lot of similiar looking, but slightly different controllers.
My question(s) are the following:
Are there common techniques to recognize bad code and optimize it?
How can one determine if code is either good, bad or redundant?
Where should one start, when trying to provide better code quality in
an existing software project?
To answer your questions:
Yes there are: by looking at the code itself experienced programmers can tell if the code has certain characteristics or not. Some metrics exist that could indicate alarm signals in terms of quality like "many dots" in object-oriented languages (same in Javascript) which indicates close coupling. Here is a comprehensive list.
By looking at it or as written before with static code analysis.
As others stated don't optimize or refactor just for the sake to have a good looking code base. When you need to touch existing code again to e.g. add a feature or fix a bug then start to look for code redundancies and many other signals that might indicate to refactor the code. Martin Fowler wrote an excellent book about it with step-by-step examples which IMO is a must-read for every developer. Also a good starting point is Misko's site. He talks about testability but "good" code is well testable.
What's really important before refactoring is to have a strong automated test base to rely on. If not add tests and go really slow to make sure you don't break existing functionality.
The topic is really huge and impossible to work through in a post here but I think it's one of the most important ones that makes an experienced programmer.
If a code is good or bad is your own opinion.
To make the code look better and more efficient I would do something like this:
Don't make the lines to long.
Use variables that make sense.
Use tabs and enter when it is too messy.
There are a whole lot more things to clean up your code, but these are just some examples.
If the code works - don't touch it :)
Then when you work on bug fixes or new features\changes - see if you can also gradually improve pieces of code you are working with. The more you work with the code the better understanding of overall picture you should get and opportunities to improve and optimize should become more obvious. (you should also continue learning from other sources - books, internet, other codebases)
There is now magic "one size fits all" solution :) but yes, you can start with simple style changes as suggested in the other answer.
The process you refer to is commonly known as Refactoring. There are a number of standard techniques for improving code; Martin Fowler's book "Refactoring" has a list, with examples.
Many popular IDEs have refactoring tools built-in.
One of the processes in agile development is known as "red/green/refactor". Red means your code doesn't pass its unit tests; green means it passes (i.e. it does what it's supposed to do), and "refactor" means you make it elegant, maintainable and clean. Because you have a unit test, you know the refactoring doesn't break the code.
Where to start is a tough question - I typically recommend refactoring when you're fixing bugs. You may as well write a unit test to expose the bug, and tidy up the code while fixing the bug. Because that module has a bug, it's likely to be high-risk, so you should improve the code quality.

When you write your code, do you deal with errors proactively or reactively?

In other words, do you spend time anticipating errors and writing code to get around these potential issues, or do you write the code as you see fit and then work through any errors on an issue by issue basis?
I've been thinking a lot about this lately and I'm very much a reactive person. I write my code, give it a whirl, go back correct error and repeat until application works as expected. However a friend of mine offered that he spends time thinking how each line is interpreted and fixes errors before they occur.
I must point out that re-active is pure PRE-live. I definitely make sure my application is working before it goes live.
There should always be a balance.
Too many error checking is slow and leads to garbage code. Not enough error checking makes your program crash on edge cases which is not very good to discover after having it shipped.
So you decide how reliable some piece of code should be and implement error checking accordingly. Some test utility can be not very reliable - less error checking. A COM server meant to be used by a third party search service in deep background should be super reliable - much more error checking.
I think asking this in isolation is kinda weird, and very subjective, however there are obviously a bunch of techniques that permit you to do each. I tend to use these two:
Test-driven development (this would seem to be proactive)
Strong, static typing (reactive, but part of a tight iterative development cycle, as in, it's enforced by my ML compiler, and I compile a lot)
Very occasionally I swerve into the world of formal verification of programs. That's definitely "reactive", but if you think a little more up-front, it tends to make the verification easier.
I must also say that I value a lot of up-front thought in programming. The easiest way to avoid bugs is to not write them in the first place. Sometimes it's inevitable, but often a little more time spent thinking about the problem can lead to better-quality solutions, and then the rest can be taken care of using the kinds of automated methods I talked about above.
I usually ask myself a bunch of what-ifs when coding, like
The user clicks the button, what if they didn't select a date?
The user is typing in the search box, what if they try to type html in there?
My label text depends on a value from a shared drive, what if it's not mapped?
and so on. By doing this I've found that when the application does go live, there are a ton fewer errors and I can focus on fixing more obscure bugs instead of correcting conditions that should have been in place to begin with.
I live by a simple principle when considering error-handling: garbage in, garbage out. If you don't want any garbage (e.g. invalid input) messing up your software, you have to find all the points in your software where it can get in and handle it. Of course, the more complicated your software is, the harder it is to find every point of entry, but I feel that the more you do up front the less reactive you will need to be later on.
I advocate the proactive approach.
I try to write the code in that style which results in maintainable and reliable code
I use the defensive programming techniques to prevent stupid errors in code due to my loss of attention and similar
I design the database model according to the fortress principle, SQL code checking for results after each singular operation
I think of potential problems that can happen with that part of the code and I account for that. Not for every possibility but for major ones I can think of right now.
This usually results in software operating rather smoothly. At times it even surprises me but that was the intended goal, so here we are.
IMHO, the word "Error" (or its loose synonym "bug") itself means that it is a program behavior that was not foreseen.
I usually try to design with all possible scenarios in mind. Of course, it is usually not possible to think of all possible cases. But thinking through and allowing for as many scenarios as possible is usually better than just getting something working as soon as possible. This saves a lot of time and effort debugging and redesigning the code. I often sit down with pen and paper for even the smallest of programing tasks before actually typing any code into my editor.
As I said, this will not eliminate all errors. For me it pays off many times over in terms of time spent debugging. Another benefit is that it results in a more solid and maintainable design with fewer bugfixing hacks and special cases added on later. But in any case, you will have to do a lot of debugging after the code is done.
This does not apply when all you want is a mockup or rapid prototype. Also practical constraints such as deadlines often makes a thorough evaluation difficult or impossible.
What kind of programming? It's impossible to answer this in any general way. (It's like asking "do you wear a helmet when playing?" -- well, playing what?)
At work, I'm working on a database-backed website. The requirements are strict, and if I don't anticipate how users will screw it up, I'm going to get a call at some odd hour of the day to fix it.
At home, I'm working on a program ... I don't even know what it'll do yet. I can't deal with 'errors' because I don't know what 'an error' is in this context, because I don't know what correct behavior is going to be. The entire purpose of the program can and frequently does change on a timescale of minutes to hours, so even a couple minutes spent thinking about errors this early is a complete waste of time. (It's even worse than browsing SO, since error-handling adds lines of code.)
I guess the only general answer is "I do what makes sense in terms of saving time in the long term", which is, after all, the whole reason to use machines to do work for us.

Do good tests enable sloppy coding?

Let's say you're coding, and you come across an opportunity for simple code resuse (e.g. pulling a common piece of code out to an accessible place like a Utility class or base class). You might find yourself thinking, "I know it's good to do this, but I have to get this done now, and if I need to make a change to this code, and forget to change it in the other place, my testing framework will let me know."
In other words, you let the awesome tests you (or another developer) has written to remind you to change the code in the other places too.
Is this a legitimate problem that we might find in ourselves or other developers?
You're asking whether unit tests encourage you to rely on them as a method of TODO list? Yes, but I don't think that's sloppy coding. You are, afterall, to start with unit tests failing and code to the test; if you refactor some code and then once again code to the test, that isn't sloppy coding -- it's doing what you're supposed to.
I think the problem with unit tests is simply that you can't cover every corner case in a unit test, and sometimes people assume that a working test means a working app, which isn't true.
In the example you provide, good tests are in fact enabling you to implement sloppy design, however in my experience, bad tests wouldn't have discouraged you from doing the same.
The fallacy in your argument centers around the premise that "getting this done now" means you will save time by implementing sloppy design. The truth of the matter is that you are incurring technical debt whether your tests are good or not. Making a change to that code is now a much more complex task, whether you have a good testing framework to remind you of that or not.
Although immature code may work fine
and be completely acceptable to the
customer, excess quantities will make
a program unmasterable, leading to
extreme specialization of programmers
and finally an inflexible product.
- Ward Cunningham
The strength of good testing practices may be in allowing you to incur that debt with some level of safety. As long as you continue to be aware that this area of the code is now weak, as a result of your choices, then it may be worth the tradeoff -- you ship your product sooner, at the cost of higher debt, with a lower risk of incurring bugs in the short run as a result.
If the tests are good and the code (sloppy or otherwise) pass them, all is good. It would be nice to have good code but sloppy working code is better than good broken code.
I don't use tests as my first option to finding the code that needs changes. I'll use my IDE's search (or refactoring) functionality and look for all the places that call the method in question.
The tests are just a nice addition in case I was accidentally sloppy or accidentally introduced a bug. Test don't make me sloppy from the start, they just reassure me once I think I'm done.
I would say that good tests enable you to fix sloppy coding.
You can certainly write incredibly sloppy code with or without tests. Unit testing makes it slightly easier to get away with it, but only in the short run.
If you have a set of logic copied in two places in your code (IMO the worst thing a developer can do), then you probably have inconsistent tests as well.
The most important job any programmer can do is ruthlessly refactor the code, removing ALL duplication. This almost always shows benefits on even a single iteration.
Why would you think if you had an error in copied code in 2 places that your tests would be any better?
It sounds more to me like sloppy developers and sloppy coding practices are what are leading to sloppy code in your example. The tests you described would prevent the sloppy code from ever getting to far.

What OOP coding practices should you always make time for?

I tend to do a lot of projects on short deadlines and with lots of code that will never be used again, so there's always pressure/temptation to cut corners. One rule I always stick to is encapsulation/loose coupling, so I have lots of small classes rather than one giant God class. But what else should I never compromise on?
Update - thanks for the great response. Lots of people have suggested unit testing, but I don't think that's really appropriate to the kind of UI coding I do. Usability / User acceptance testing seems much important. To reiterate, I'm talking about the BARE MINIMUM of coding standards for impossible deadline projects.
Not OOP, but a practice that helps in both the short and long run is DRY, Don't Repeat Yourself. Don't use copy/paste inheritance.
Not a OOP practice, but common sense ;-).
If you are in a hurry, and have to write a hack. Always add a piece of comment with the reasons. So you can trace it back and make a good solution later.
If you never had the time to come back, you always have the comment so you know, why the solution was chosen at the moment.
Use Source control.
No matter how long it takes to set up (seconds..), it will always make your life easier! (still it's not OOP related).
Naming. Under pressure you'll write horrible code that you won't have time to document or even comment. Naming variables, methods and classes as explicitly as possible takes almost no additional time and will make the mess readable when you must fix it. From an OOP point of view, using nouns for classes and verbs for methods naturally helps encapsulation and modularity.
Unit tests - helps you sleep at night :-)
This is rather obvious (I hope), but at the very least I always make sure my public interface is as correct as possible. The internals of a class can always be refactored later on.
no public class with mutable public variables (struct-like).
Before you know it, you refer to this public variable all over your code, and the day you decide this field is a computed one and must have some logic in it... the refactoring gets messy.
If that day is before your release date, it gets messier.
Think about the people (may even be your future self) who have to read and understand the code at some point.
Application of the single responsibility principal. Effectively applying this principal generates a lot of positive externalities.
Like everyone else, not as much OOP practices, as much as practices for coding that apply to OOP.
Unit test, unit test, unit test. Defined unit tests have a habit of keeping people on task and not "wandering" aimlessly between objects.
Define and document all hierarchical information (namespaces, packages, folder structures, etc.) prior to writing production code. This helps to flesh out object relations and expose flaws in assumptions related to relationships of objects.
Define and document all applicable interfaces prior to writing production code. If done by a lead or an architect, this practice can additionally help keep more junior-level developers on task.
There are probably countless other "shoulds", but if I had to pick my top three, that would be the list.
Edit in response to comment:
This is precisely why you need to do these things up front. All of these sorts of practices make continued maintenance easier. As you assume more risk in the kickoff of a project, the more likely it is that you will spend more and more time maintaining the code. Granted, there is a larger upfront cost, but building on a solid foundation pays for itself. Is your obstacle lack of time (i.e. having to maintain other applications) or a decision from higher up? I have had to fight both of those fronts to be able to adopt these kinds of practices, and it isn't a pleasant situation to be in.
Of course everything should be Unit tested, well designed, commented, checked into source control and free of bugs. But life is not like that.
My personal ranking is this:
Use source control and actually write commit comments. This way you have a tiny bit of documentation should you ever wonder "what the heck did I think when I wrote this?"
Write clean code or document. Clean well-written code should need little documentation, as it's meaning can be grasped from reading it. Hacks are a lot different. Write why you did it, what you do and what you'd like to do if you had the time/knowledge/motivation/... to do it right
Unit Test. Yes it's down on number three. Not because it's unimportant but because it's useless if you don't have the other two at least halfway complete. Writing Unit tests is another level of documentation what you code should be doing (among others).
Refactor before you add something. This might sound like a typical "but we don't have time for it" point. But as with many of those points it usually saves more time than it costs. At least if you have at least some experience with it.
I'm aware that much of this has already been mentioned, but since it's a rather subjective matter, I wanted to add my ranking.
[insert boilerplate not-OOP specific caveat here]
Separation of concerns, unit tests, and that feeling that if something is too complex it's probably not conceptualised quite right yet.
UML sketching: this has clarified and saved any amount of wasted effort so many times. Pictures are great aren't they? :)
Really thinking about is-a's and has-a's. Getting this right first time is so important.
No matter how fast a company wants it, I pretty much always try to write code to the best of my ability.
I don't find it takes any longer and usually saves a lot of time, even in the short-term.
I've can't remember ever writing code and never looking at it again, I always make a few passes over it to test and debug it, and even in those few passes practices like refactoring to keep my code DRY, documentation (to some degree), separation of concerns and cohesion all seem to save time.
This includes crating many more small classes than most people (One concern per class, please) and often extracting initialization data into external files (or arrays) and writing little parsers for that data... Sometimes even writing little GUIs instead of editing data by hand.
Coding itself is pretty quick and easy, debugging crap someone wrote when they were "Under pressure" is what takes all the time!
At almost a year into my current project I finally set up an automated build that pushes any new commits to the test server, and man, I wish I had done that on day one. The biggest mistake I made early-on was going dark. With every feature, enhancement, bug-fix etc, I had a bad case of the "just one mores" before I would let anyone see the product, and it literally spiraled into a six month cycle. If every reasonable change had been automatically pushed out it would have been harder for me to hide, and I would have been more on-track with regard to the stakeholders' involvement.
Go back to code you wrote a few days/weeks ago and spend 20 minutes reviewing your own code. With the passage of time, you will be able to determine whether your "off-the-cuff" code is organized well enough for future maintenance efforts. While you're in there, look for refactoring and renaming opportunities.
I sometimes find that the name I chose for a function at the outset doesn't perfectly fit the function in its final form. With refactoring tools, you can easily change the name early before it goes into widespread use.
Just like everybody else has suggested these recommendations aren't specific to OOP:
Ensure that you comment your code and use sensibly named variables. If you ever have to look back upon the quick and dirty code you've written, you should be able to understand it easily. A general rule that I follow is; if you deleted all of the code and only had the comments left, you should still be able to understand the program flow.
Hacks usually tend to be convoluted and un-intuitive, so some good commenting is essential.
I'd also recommend that if you usually have to work to tight deadlines, get yourself a code library built up based upon your most common tasks. This will allow you to "join the dots" rather than reinvent the wheel each time you have a project.
Regards,
Docta
An actual OOP practice I always make time for is the Single Responsibility Principle, because it becomes so much harder to properly refactor the code later on when the project is "live".
By sticking to this principle I find that the code I write is easily re-used, replaced or rewritten if it fails to match the functional or non-functional requirements. When you end up with classes that have multiple responsibilities, some of them may fulfill the requirements, some may not, and the whole may be entirely unclear.
These kinds of classes are stressful to maintain because you are never sure what your "fix" will break.
For this special case (short deadlines and with lots of code that will never be used again) I suggest you to pay attention to embedding some script engine into your OOP code.
Learn to "refactor as-you-go". Mainly from an "extract method" standpoint. When you start to write a block of sequential code, take a few seconds to decide if this block could stand-alone as a reusable method and, if so, make that method immediately. I recommend it even for throw-away projects (especially if you can go back later and compile such methods into your personal toolbox API). It doesn't take long before you do it almost without thinking.
Hopefully you do this already and I'm preaching to the choir.

Are code generators bad?

I use MyGeneration along with nHibernate to create the basic POCO objects and XML mapping files. I have heard some people say they think code generators are not a good idea. What is the current best thinking? Is it just that code generation is bad when it generates thousands of lines of not understandable code?
Code generated by a code-generator should not (as a generalisation) be used in a situation where it is subsequently edited by human intervention. Some systems such the wizards on various incarnations of Visual C++ generated code that the programmer was then expected to edit by hand. This was not popular as it required developers to pick apart the generated code, understand it and make modifications. It also meant that the generation process was one shot.
Generated code should live in separate files from other code in the system and only be generated from the generator. The generated code code should be clearly marked as such to indicate that people shouldn't modify it. I have had occasion to do quite a few code-generation systems of one sort or another and All of the code so generated has something like this in the preamble:
-- =============================================================
-- === Foobar Module ===========================================
-- =============================================================
--
-- === THIS IS GENERATED CODE. DO NOT EDIT. ===
--
-- =============================================================
Code Generation in Action is quite a good book on the subject.
Code generators are great, bad code is bad.
Most of the other responses on this page are along the lines of "No, because often the generated code is not very good."
This is a poor answer because:
1) Generators are tool like anything else - if you misuse them, dont blame the tool.
2) Developers tend to pride themselves on their ability to write great code one time, but you dont use code generators for one off projects.
We use a Code Generation system for persistence in all our Java projects and have thousands of generated classes in production.
As a manager I love them because:
1) Reliability: There are no significant remaining bugs in that code. It has been so exhaustively tested and refined over the years than when debugging I never worry about the persistence layer.
2) Standardisation: Every developers code is identical in this respect so there is much less for a guy to learn when picking up a new project from a coworker.
3) Evolution: If we find a better way to do things we can update the templates and update 1000's of classes quickly and consistently.
4) Revolution: If we switch to a different persistence system in the future then the fact that every single persistent class has an exactly identical API makes my job far easier.
5) Productivity: It is just a few clicks to build a persistent object system from metadata - this saves thousands of boring developer hours.
Code generation is like using a compiler - on an individual case basis you might be able to write better optimised assembly language, but over large numbers of projects you would rather have the compiler do it for you right?
We employ a simple trick to ensure that classes can always be regenerated without losing customisations: every generated class is abstract. Then the developer extends it with a concrete class, adds the custom business logic and overrides any base class methods he wants to differ from the standard. If there is a change in metadata he can regenerate the abstract class at any time, and if the new model breaks his concrete class the compiler will let him know.
The biggest problem I've had with code generators is during maintenance. If you modify the generated code and then make a change to your schema or template and try to regenerate you can have problems.
One problem is if the tool doesn't allow you to protect changes you've made to the modified code then your changes will be overwritten.
Another problem I've seen, particularly with code generators in RSA for web services, if you change the generated code too much the generator will complain that there is a mismatch and refuse to regenerate the code. This can happen for something as simple as changing the type of a variable. Then you are stuck generating the code to a different project and merging the results back into your original code.
Code generators can be a boon for productivity, but there are a few things to look for:
Let you work the way you want to work.
If you have to bend your non-generated code to fit around the generated code, then you should probably choose a different approach.
Run as part of your regular build.
The output should be generated to an intermediates directory, and not be checked in to source control. The input must be checked in to source control, however.
No install
Ideally, you check the tool in to source control, too. Making people install things when preparing a new build machine is bad news. For example, if you branch, you want to be able to version the tools with the code.
If you must, make a single script that will take a clean machine with a copy of the source tree, and configure the machine as required. Fully automated, please.
No editing output
You shouldn't have to edit the output. If the output isn't useful enough as-is, then the tool isn't working for you.
Also, the output should clearly state that it is a generated file & should not be edited.
Readable output
The output should be written & formatted well. You want to be able to open the output & read it without a lot of trouble.
#line
Many languages support something like a #line directive, which lets you map the contents of the output back to the input, for example when producing compiler error messages or when stepping in the debugger. This can be useful, but it can also be annoying unless done really well, so it's not a requirement.
My stance is that code generators are not bad, but MANY uses of them are.
If you are using a code generator for time savings that writes good code, then great, but often times it is not optimized, or adds a lot of overhead, in those cases I think it is bad.
Code generation might cause you some grief if you like to mix behaviour into your classes. An equally productive alternative might be attributes/annotations and runtime reflection.
Compilers are code generators, so they are not inherently bad unless you only like to program in raw machine code.
I believe however that code generators should always completely encapsulate the generated code. I.e. you should never have to modify the generated code by hand, any change should be done by modifying the input to the generator and regenerate the code.
If its a mainframe cobol code generator that Fran Tarkenton is trying to sell you then absolutely yes!
I've written a few code generators before - and to be honest they saved my butt more than once!
Once you have a clearly defined object - collection - user control design, you can use a code generator to build the basics for you, allowing your time as a developer to be used more effectively in building the complex stuff, after all, who really wants to write 300+ public property declarations and variable instatiations? I'd rather get stuck into the business logic than all the mindless repetitive tasks.
The mistake many people make when using code generation is to edit the generated code. If you keep in mind that if you feel like you need to edit the code, you actually need to be editing the code generation tool it's a boon to productivity. If you are constantly fighting the code that gets generated it's going to end up costing productivity.
The best code generators I've found are those that allow you to edit the templates that generate the code. I really like Codesmith for this reason, because it's template-based and the templates are easily editable. When you find there is a deficiency in the code that gets generated, you just edit the template and regenerate your code and you are forever good after that.
The other thing that I've found is that a lot of code generators aren't super easy to use with a source control system. The way we've gotten around this is to check in the templates rather than the code and the only thing we check into source control that is generated is a compiled version of the generated code (DLL files, mostly). This saves you a lot of grief because you only have to check in a few DLLs rather than possibly hundreds of generated files.
Our current project makes heavy use of a code generator. That means I've seen both the "obvious" benefits of generating code for the first time - no coder error, no typos, better adherence to a standard coding style - and, after a few months in maintenance mode, the unexpected downsides. Our code generator did, indeed, improve our codebase quality initially. We made sure that it was fully automated and integrated with our automated builds. However, I would say that:
(1) A code generator can be a crutch. We have several massive, ugly blobs of tough-to-maintain code in our system now, because at one point in the past it was easier to add twenty new classes to our code generation XML file, than it was to do proper analysis and class refactoring.
(2) Exceptions to the rule kill you. We use the code generator to create several hundred Screen and Business Object classes. Initially, we enforced a standard on what methods could appear in a class, but like all standards, we started making exceptions. Now, our code generation XML file is a massive monster, filled with special-case snippets of Java code that are inserted into select classes. It's nearly impossible to parse or understand.
(3) Since so much of our code is generated, using values from a database, it's proven difficult for developers to maintain a consistent code base on their individual workstations (since there can be multiple versions of the database). Debugging and tracing through the software is a lot harder, and newbies to the team take much longer to figure out the "flow" of the code, because of the extra abstraction and implicit relationships between classes. IDE's cannot pick up relationships between two classes that communicate via a code-generated class.
That's probably enough for now. I think Code Generators are great as part of a developer's individual toolkit; a set of scripts that write out your boilerplate code make starting a project a lot easier. But Code Generators do not make maintenance problems go away.
In certain (not many) cases they are useful. Such as if you want to generate classes based on lookup-type data in the database tables.
Code generation is bad when it makes programming more difficult (IE, poorly generated code, or a maintenance nightmare), but they are good when they make programming more efficient.
They probably don't always generate optimal code, but depending on your need, you might decide that developer manhours saved make up for a few minor issues.
All that said, my biggest gripe with ORM code generators is that maintenance the generated code can be a PITA if the schema changes.
Code generators are not bad, but sometimes they are used in situations when another solution exists (ie, instantiating a million objects when an array of objects would have been more suitable and accomplished in a few lines of code).
The other situation is when they are used incorrectly, or coded badly. Too many people swear off code generators because they've had bad experiences due to bugs, or their misunderstanding of how to correctly configure it.
But in and of themselves, code generators are not bad.
-Adam
They are like any other tool. Some give beter results than others, but it is up to the user to know when to use them or not. A hammer is a terrible tool if you are trying to screw in a screw.
This is one of those highly contentious issues. Personally, I think code generators are really bad due to the unoptimized crap code most of them put out.
However, the question is really one that only you can answer. In a lot of organizations, development time is more important than project execution speed or even maintainability.
We use code generators for generating data entity classes, database objects (like triggers, stored procs), service proxies etc. Anywhere you see lot of repititive code following a pattern and lot of manual work involved, code generators can help. But, you should not use it too much to the extend that maintainability is a pain. Some issues also arise if you want to regenerate them.
Tools like Visual Studio, Codesmith have their own templates for most of the common tasks and make this process easier. But, it is easy to roll out on your own.
It can really become an issue with maintainability when you have to come back and cant understand what is going on in the code. Therefore many times you have to weigh how important it is to get the project done fast compared to easy maintainability
maintainability <> easy or fast coding process
I use My Generation with Entity Spaces and I don't have any issues with it. If I have a schema change I just regenerate the classes and it all works out just fine.
They serve as a crutch that can disable your ability to maintain the program long-term.
The first C++ compilers were code generators that spit out C code (CFront).
I'm not sure if this is an argument for or against code generators.
I think that Mitchel has hit it on the head.
Code generation has its place. There are some circumstances where it's more effective to have the computer do the work for you!
It can give you the freedom to change your mind about the implementation of a particular component when the time cost of making the code changes is small. Of course, it is still probably important to understand the output the code generator, but not always.
We had an example on a project we just finished where a number of C++ apps needed to communicate with a C# app over named pipes. It was better for us to use small, simple, files that defined the messages and have all the classes and code generated for each side of the transaction. When a programmer was working on problem X, the last thing they needed was to worry about the implentation details of the messages and the inevitable cache hit that would entail.
This is a workflow question. ASP.NET is a code generator. The XAML parsing engine actually generates C# before it gets converted to MSIL. When a code generator becomes an external product like CodeSmith that is isolated from your development workflow, special care must be taken to keep your project in sync. For example, if the generated code is ORM output, and you make a change to the database schema, you will either have to either completely abandon the code generator or else take advantage of C#'s capacity to work with partial classes (which let you add members and functionality to an existing class without inheriting it).
I personally dislike the isolated / Alt-Tab nature of generator workflows; if the code generator is not part of my IDE then I feel like it's a kludge. Some code generators, such as Entity Spaces 2009 (not yet released), are more integrated than previous generations of generators.
I think the panacea to the purpose of code generators can be enjoyed in precompilation routines. C# and other .NET languages lack this, although ASP.NET enjoys it and that's why, say, SubSonic works so well for ASP.NET but not much else. SubSonic generates C# code at build-time just before the normal ASP.NET compilation kicks in.
Ask your tools vendor (i.e. Microsoft) to support pre-build routines more thoroughly, so that code generators can be integrated into the workflow of your solutions using metadata, rather than manually managed as externally outputted code files that have to be maintained in isolation.
Jon
The best application of a code generator is when the entire project is a model, and all the project's source code is generated from that model. I am not talking UML and related crap. In this case, the project model also contains custom code.
Then the only thing developers have to care about is the model. A simple architectural change may result in instant modification of thousands of source code lines. But everything remains in sync.
This is IMHO the best approach. Sound utopic? At least I know it's not ;) The near future will tell.
In a recent project we built our own code generator. We generated all the data base stuff, and all the base code for our view and view controller classes. Although the generator took several months to build (mostly because this was the first time we had done this, and we had a couple of false starts) it paid for itself the first time we ran it and generated the basic framework for the whole app in about ten minutes.
This was all in Java, but Ruby makes an excellent code-writing language particularly for small, one-off type projects.
The best thing was the consistency of the code and the project organization. In addition you kind of have to think the basic framework out ahead of time, which is always good.
Code generators are great assuming it is a good code generator. Especially working c++/java which is very verbose.