To monkey-patch or not to? - oop

This is more of a general question than a language-specific one, although I bumped into this problem while playing with the Python ncurses module. I needed to display locale characters and have them recognized as characters, so I quickly monkey-patched a few functions and methods from the curses module.
This was what I call a fast and ugly solution, even if it works. The changes were relatively small, so I can hope I haven't messed anything up. My plan was to find another solution, but seeing that it works and works well, you know how it is: I moved on to other problems I had to deal with, and I'm sure that unless a bug shows up I will never improve it.
A more general question occurred to me, though: obviously some languages allow us to monkey-patch large chunks of code inside classes. If this is code I only use for myself, or the change is small, it's OK. But what if some other developer takes my code? They see that I use some well-known module, so they assume it works as it usually does. Then a method suddenly behaves differently than it should.
So, very subjective, should we use monkey patching, and if yes, when and how? How should we document it?
edit: for #guerda:
Monkey-patching is the ability to dynamically change the behavior of some piece of code at run time, without altering the code itself.
A small example in Python:
import os
def ld(name):
    print("The directory won't be listed here, it's a feature!")
os.listdir = ld
# now what happens if we call os.listdir("/home/")?
os.listdir("/home/")

Don't!
Especially with free software, you have every opportunity to get your changes into the main distribution. But if you have a weakly documented hack in your local copy, you'll never be able to ship the product, and upgrading to the next version of curses (security updates, anyone?) will be very costly.
See this answer for a glimpse into what is possible on foreign code bases. The linked screencast is really worth a watch. Suddenly a dirty hack turns into a valuable contribution.
If you really cannot get the patch upstream for whatever reason, at least create a local (git) repo to track upstream and have your changes in a separate branch.
Recently I've come across a point where I have to accept monkey-patching as last resort: Puppet is a "run-everywhere" piece of ruby code. Since the agent has to run on - potentially certified - systems, it cannot require a specific ruby version. Some of those have bugs that can be worked around by monkey-patching select methods in the runtime. These patches are version-specific, contained, and the target is frozen. I see no other alternative there.

I would say don't.
Each monkey patch should be an exception and marked (for example with a //HACK comment) as such so they are easy to track back.
As we all know, it is all too easy to leave the ugly code in place because it works, so why spend any more time on it? That means the ugly code will be there for a long time.
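One way to keep such an exception easy to track is to confine the patch to a single, clearly marked helper that also undoes it. A minimal sketch in Python, assuming a hypothetical `patched` helper and a stand-in `fake_listdir` (neither comes from any library):

```python
import contextlib
import os


@contextlib.contextmanager
def patched(obj, name, replacement):
    """Temporarily replace obj.<name>; restore the original on exit."""
    original = getattr(obj, name)
    setattr(obj, name, replacement)
    try:
        yield original
    finally:
        setattr(obj, name, original)


def fake_listdir(path="."):
    # HACK: stand-in used only for illustration.
    return ["patched"]


with patched(os, "listdir", fake_listdir):
    inside = os.listdir(".")  # served by fake_listdir

restored = os.listdir is not fake_listdir  # original is back afterwards
```

Searching the code base for `HACK` or for calls to `patched` then lists every place where behavior is being overridden.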

I agree with David in that monkey patching production code is usually not a good idea.
However, I believe that for languages that support it, monkey patching is a very valuable tool for unit testing. It allows you to isolate the piece of code you need to test even when it has complex dependencies - for instance with system calls that cannot be Dependency Injected.
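As a rough illustration of the testing use case, the standard library's `unittest.mock.patch` replaces an attribute only for the duration of a block and restores it afterwards. The function `count_entries` below is a made-up example of code that depends on a system call:

```python
import os
from unittest import mock


def count_entries(path):
    """Code under test: depends on a system call."""
    return len(os.listdir(path))


def test_count_entries():
    # Replace os.listdir only for the duration of the with-block.
    with mock.patch("os.listdir", return_value=["a", "b", "c"]) as fake:
        assert count_entries("/no/such/dir") == 3
        fake.assert_called_once_with("/no/such/dir")


test_count_entries()
```

Because the patch is undone when the block exits, other tests still see the real `os.listdir`.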

I think the question can't be addressed with a single definitive yes-no/good-bad answer - the differences between languages and their implementations have to be considered.
In Python, one needs to consider whether a class can be monkey-patched at all (see this SO question for discussion), which relates to Python's slightly less-OO implementation. So I'd be cautious and inclined to expend some effort looking for alternatives before monkey-patching.
In Ruby, OTOH, which was built to be OO down into the interpreter, classes can be modified irrespective of whether they're implemented in C or Ruby. Even Object (pretty much the base class of everything) is open to modification. So monkey-patching is rather more enthusiastically adopted as a technique in that community.
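The Python side of this can be checked with a short experiment. This is only a sketch of CPython's behavior, with a made-up class `Plain` and method `shout`:

```python
class Plain:
    """An ordinary class defined in Python."""


def shout(self):
    return "patched!"


# Classes defined in Python accept new attributes at run time.
Plain.shout = shout
plain_result = Plain().shout()

# Built-in types implemented in C reject the same assignment in CPython.
try:
    list.shout = shout
    builtin_patched = True
except TypeError:
    builtin_patched = False
```

Here `plain_result` holds the patched method's return value, while `builtin_patched` ends up `False`.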

Related

Converting Actionscript syntax to Objective C

I have a game I wrote in Actionscript 3 I'm looking to port to iOS. The game has about 9k LOC spread across 150 classes, most of the classes are for data models, state handling and level generation all of which should be easy to port.
However, the thought of rejiggering the syntax by hand across all these files is none too appealing. Are there tools that can help me speed up this process?
I'm not looking for a magical tool here, nor am I looking for a cross compiler, I just want some help converting my source files.
I don't know of a tool, but this is the way I'd try and attack your problem if there really is a lot of (simple) code to convert. I'm sure my suggestion is not that useful on parts of the code that are very flash-specific (all the DisplayObject stuff?) and also not that useful on lots of your logic. But it would be fun to build! :-)
Partial automatic conversion should be possible, especially if the objects are just 'data containers'. Watch out for bringing too much AS3 idiom over to Objective-C, though; it might not always be a good fit.
Unless you want to write your own (semi) parser for AS3, you'll need an existing one. Apparently FlexPMD has one (I have never used it), and there probably are others.
After getting your hands on a parser you have to find some way of suggesting to the system what parts could be converted automatically. You could try and add rules to the parser/generator script for the general case. For more specific cases I'd use custom metadata on the actual class/property/method, assuming a real as3 parser would correctly parse those.
Now part of your work will shift from hand-converting files to hand-annotating files, but that might be ok for you.
Have the parser parse your classes and define actions based on your metadata that will determine what kind of objective-c class to generate. If you get this working it could at least get you all your classes, their simple properties and method signatures (getting the body of the methods converted might be a bit too much to ask but you could include it as a comment so you'd have a nice reference while hand-translating).
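To give a feel for the idea, here is a deliberately tiny sketch in Python. It is not a real AS3 parser: it assumes properties are written as `public var name:Type;` on one line, and the `TYPE_MAP` table and `as3_to_objc_header` function are hypothetical names invented for this example.

```python
import re

# Hypothetical, deliberately tiny type table; extend as needed.
TYPE_MAP = {
    "int": "NSInteger",
    "Number": "double",
    "Boolean": "BOOL",
    "String": "NSString *",
}

CLASS_RE = re.compile(r"\bclass\s+(\w+)")
VAR_RE = re.compile(r"\bpublic\s+var\s+(\w+)\s*:\s*(\w+)")


def as3_to_objc_header(source):
    """Emit an Objective-C @interface for the simple properties found."""
    class_name = CLASS_RE.search(source).group(1)
    lines = ["@interface %s : NSObject" % class_name]
    for name, as3_type in VAR_RE.findall(source):
        objc_type = TYPE_MAP.get(as3_type)
        if objc_type is None:
            # Unknown type: leave a note for the human translator.
            lines.append("// TODO: convert %s:%s by hand" % (name, as3_type))
        elif objc_type.endswith("*"):
            lines.append("@property (nonatomic, strong) %s%s;" % (objc_type, name))
        else:
            lines.append("@property (nonatomic, assign) %s %s;" % (objc_type, name))
    lines.append("@end")
    return "\n".join(lines)


SAMPLE = """
public class Player {
    public var name:String;
    public var score:int;
    public var sprite:MovieClip;
}
"""

header = as3_to_objc_header(SAMPLE)
```

Anything the table does not know, such as the display-list type in the sample, is left as a TODO comment rather than guessed at.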
PS: if you make this into a one way process be very sure you don't need to re-generate it later - it would be bad if you find out that you have been modifying the generated code and somehow need to re-generate all those classes -- that would mean you'll have to redo all your hard work!
I've started putting a tool together to take the edge off the menial aspects of this process.
I'm trying to figure out if there's enough interest to make it clean and stable enough to release for others to use. I may just do it anyway.
http://meanwhileatthelab.blogspot.com.au/2012/08/automating-process-of-converting-as3-to.html
It's so far saving me a lot of time while porting one of my fairly large games from AS3 to objc.
Check out the Sparrow Framework. It's purported to be designed with Actionscript developers in mind, recreating classes that sort of emulate display list and things like that. You'll have to dive into some "rejiggering" for sure no matter what you do if you don't want to use the CS5 packager.
http://www.sparrow-framework.org/
Even if some solution exists, note that the architectural logic is different, along with many other details.
Even if it is possible, you will end up with a strange hybrid.
I have just come back from WWDC 2012, and the message is (as always) performance and a great user experience.
So you should rewrite using a different programming model.

Is "You break it, you buy it" the best policy?

There is a subtle reason why it might not be good: Sometimes, the blame for breaking something really should be placed on the individual who wrote fragile code without automated tests, not the one who broke their code by making a should-be-unrelated change somewhere else.
One imaginable example is when someone programs against an interface in a way that assumes behavior specific to the implementation du jour, but not guaranteed by existing contracts. Then someone else makes a change to the implementation that fits in the contract, but breaks the depended-on code. No tests fail because no tests are written for the depended-on code. Who's really to blame?
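A small Python sketch of that scenario, with all names invented for illustration: two implementations honor the same documented contract, and only the caller that relies on an accidental detail breaks.

```python
def user_names_v1(users):
    """Contract: return every user name; the order is NOT specified."""
    return sorted(users)  # happens to be alphabetical


def user_names_v2(users):
    """Same contract, different implementation detail."""
    return list(users)  # input order; still honours the contract


def first_alphabetically_fragile(get_names, users):
    # Relies on v1's accidental ordering rather than on the contract.
    return get_names(users)[0]


def first_alphabetically_robust(get_names, users):
    # Relies only on what the contract promises.
    return min(get_names(users))


USERS = ["mallory", "alice", "bob"]
```

With `USERS`, the fragile caller returns `"alice"` against v1 but `"mallory"` against v2, while the robust caller returns `"alice"` in both cases. A test on the fragile caller would have flagged the hidden dependency when the implementation changed.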
The purpose of this isn't to blame people, but to understand responsibilities, and if "You break it, you buy it" is really such a good policy.
EDIT: I really worded this poorly. I meant it to be about how to write correct software with respect to dependencies, including hidden dependencies. I meant this to be a question of what a programmer's responsibility is to avoid bugs, not what to do when a surprise bug is found. BUT, since so many answers have been given already, I'll let the question stand as-is and indicate an answer accordingly.
I think you have nothing to gain and everything to lose by promoting an atmosphere of blaming and finger pointing. When something is broken, you should assign it to the best person to fix the problem, whether that is the last person to touch that area because he or she knows the area, or to the person who wrote it first so knows the design philosophy best, or even just the person without anything more pressing to do.
"You break it, you buy it" makes sense in terms of breaking builds, not more serious problems.
If you put the build into a state where it can't compile, or run basic tests, you are blocking other people's work. If you can't see a quick and simple fix (because you introduced a quick and simple bug) then just roll back your changes (perhaps with local copies of what you'd worked on in the meantime) and commit.
If the fact that you broke the build is ultimately due to a wider issue, then deal with that wider issue, whether by fixing it, reporting it, or assigning it.
Short term, the person who made the code-base unworkable quickly makes it workable again. Long term, the best person for the job (balancing different factors) does the job.
The purpose is fixing it, not laying blame. Suppose the original author of the fragile code has moved on: who would own the problem? Suppose he or she has simply been assigned to another project? The person who ran into the problem needs to own it until it is fixed, since he or she is the person currently there and currently assigned to the task of making changes to the application.
Now, if you know the code was created with a problem that should be avoided in the future and the original developer is still there, it would be a good thing to let him or her know about the issue and why it caused a problem. Ultimately, though, the person who ran into the problem is the one who will need to fix it to get the new code to work.
I would say that assigning ownership to the disaster prone may not always be the most productive strategy. It is likely its own reward.
The last person to touch it should be at fault. Refactoring is a large part of software development; if somebody touched the code and did not properly document, write and test it, then that is on them. The estimate should include the time needed to leave the code in better shape than it was found.
That being said, if something does not work, the whole team should take the fall.
In an environment where people have to work together, cooperation should be of greater importance than placing blame. If someone's module is fragile and his or her peers agree that something should be done about it, a good team player would fix the problem; that is, he or she would write unit tests, etc.
In any case, a programmer's own code is ultimately their responsibility, and if they can't handle the responsibility of making their code cooperate with that of others, then they rightfully must take the blame. But not before giving them a chance or two to clean up their act.

Compromising design & code quality to integrate with existing modules

Greetings!
I inherited a C#/.NET application that I have been extending and improving for a while now. Overall it was obviously a rush job (or whoever wrote it was seemingly less competent than I am). The app pulls some data from an embedded device, then displays and manipulates it. At the core is a communications thread in the main application form, which executes a method of 600+ lines that calls functions all over the place, implementing a state machine - lots of if-state-then-do type code. Interaction with the device is done by setting the state/mode globally and letting the thread do its thing. (This is just one example of the badness of the code - overall it is not very OO-like; it is reminiscent of the style of embedded C code the device firmware is written in.)
My problem is that this piece of code is central to the application. The software, communications protocol or device firmware are not documented at all. Obviously to carry on with my work I have to interact with this code.
What I would like some guidance on, is whether it is worth scrapping this code & trying to piece together something more reasonable from the information I can reverse engineer? I can't decide! The reason I don't want to refactor is because the code already works, and changing it will surely be a long, laborious and unpleasant task. On the flip side, not refactoring means I have to sometimes compromise the design of other modules so that I may call my code from this state machine!
I've heard of "If it ain't broke don't fix it!", so I am wondering if it should apply when "it" is influencing the design of future code! Any advice would be appreciated!
Thanks!
Also, the longer you wait, the worse the codebase will smell. My suggestion would be to first create a test suite that you can evaluate your refactoring against. This makes it a lot easier to see if you are refactoring or just plain breaking things :).
I would definitely recommend that you refactor the code if you feel it's junky. Yes, during the process of refactoring you may have some inconsistencies or problems at the start. But that is why we have iterations and testing. Since you are going to build on this core engine in the future, why not make the foundation as stable as possible?
However, be very sure of what you are going to do, because a long method is not necessarily evil. It may even be very efficient at run time. If/else blocks are not bad if you ask me, as they branch very cheaply from a microprocessor's perspective. So you will have to use your judgment and be very clear before you touch this.
But once you refactor the code, you will definitely have fine control over it. And don't forget to document it! Tomorrow, someone might very well come along and say about you what you have just said about the person who wrote that core code.
This depends on the constraints you are facing; it's a decision to be made on a practical basis, not a theoretical one. You need to consider three things.
Time: you need to have enough time to learn it, implement it, and test it, without too many other tasks interrupting you
Boss #1: if you are working for someone, he needs to know and approve the time and effort you will spend immediately, required to rebuild your solution
Boss #2: your boss also needs to know that the advantage of having new and clean software will come at the price of possible regressions, and therefore at the beginning of the deployment there may be unexpected bugs
If you have those three, then go ahead and refactor it. It will surely be worth it!
First and foremost, get all the business logic out of the Form. Second, locate all the parts where the code interacts with the global state (e.g. accessing the embedded system). Delegate all this access to methods. Then, move these methods into a new class and create an instance in the class's constructor. Finally, inject an instance for the class to use.
Following these steps, you can move your embedded system logic ("existing module") to a wrapper class you write, so the interface can be nice and clean and more manageable. Then you can better tackle refactoring the monster method because there is less global state to worry about (only local state).
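The question is about C#, but the shape of the result is the same in any language. Here is a minimal Python sketch of where those steps could end up, assuming hypothetical names `DeviceLink` and `ReadingsService` and an invented `"read"` mode:

```python
class DeviceLink:
    """Hypothetical wrapper owning the state that used to be global."""

    def __init__(self):
        self.mode = "idle"
        self.sent = []

    def set_mode(self, mode):
        self.mode = mode

    def send(self, payload):
        # Record what would be written to the device in this mode.
        self.sent.append((self.mode, payload))


class ReadingsService:
    """Business logic moved out of the form; the link is injected."""

    def __init__(self, link):
        self._link = link

    def request_readings(self):
        self._link.set_mode("read")
        self._link.send(b"GET")


link = DeviceLink()
ReadingsService(link).request_readings()
```

Because the link is passed in, a test can hand `ReadingsService` a fake link and inspect what was sent without touching the real device.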
If the code works and you can integrate your part with minimal changes to it, then leave the code as it is and do your integration.
If the code is simply a big barrier in your way to add new functionality then it is best for you to refactor it.
Talk with other people that are responsible for the project, explain the situation, give an estimation explaining the benefits gained after refactoring the code and I'm sure (I hope) that the best choice will be made. It is best to speak about what you think, don't keep anything inside, especially if this affects your productivity, motivation etc.
NOTE: Usually rewriting code is out of the question but depending on situation and amount of code needed to be rewritten the decision may vary.
You say that this is having an impact on the future design of the system. In this case I would say it is broken and does need fixing.
But you do have to take into account the business requirements. Often reality gets in the way!
Would it be possible to wrap this code up in another class whose interface better suits how you want to take the system forward? (See adapter pattern)
This would allow you to move forward with your requirements without the poor design having an impact.
It gives you an interface that you understand and that you can write unit tests for. These tests can be based on what your design requires from this code. They ensure that your assumptions about what it is doing are correct. If, as you say, this code works, then any failing test probably means that one of your assumptions is incorrect.
Once you have these tests you can safely refactor - one step at a time, and when you have some spare time or when it is needed - as per business requirements.
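A rough sketch of the adapter-plus-tests idea in Python follows. The legacy function, its global mode flag and the `"T=21"` reply format are all invented stand-ins, since the real protocol is undocumented:

```python
import unittest

# Stand-in for the legacy entry point; the real one is assumed to be
# driven by a global mode flag, as described in the question.
_legacy_state = {"mode": None}


def legacy_poll():
    if _legacy_state["mode"] == "temperature":
        return "T=21"
    return ""


class DeviceAdapter:
    """The interface the new code would like to program against."""

    def read_temperature(self):
        _legacy_state["mode"] = "temperature"
        raw = legacy_poll()
        return int(raw.split("=")[1])


class DeviceAdapterTest(unittest.TestCase):
    def test_temperature_is_parsed_as_int(self):
        # Pins an assumption about the legacy behaviour.
        self.assertEqual(DeviceAdapter().read_temperature(), 21)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(DeviceAdapterTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

If such a test fails against code that is known to work, the assumption in the test is what needs revisiting, not the legacy code.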
Quite often I find the best way to truly understand a piece of code is to refactor it.
EDIT
On reflection, as this is one big method with multiple calls to the outside world, you are going to need some kind of inverse Adapter class to wrap this method. If you can inject dependencies into the method (see Dependency Inversion) such that the method calls methods in your classes, then you can route these to the original calls.

Agile practices to avoid deprecated code? [closed]

I am converting an open source Java library to C#, which has a number of methods and classes tagged as deprecated. This project is an opportunity to start with a clean slate, so I plan to remove them entirely. However, being new to working on larger projects, I am nervous that the situation will arise again. Since much of agile development revolves around making something work now and refactoring later if needed, it seems like deprecation of APIs must be a common problem. Are there preventative measures I can take to avoid/minimize API deprecation, even if I am not entirely sure of the future direction of a project?
I'm not sure there is much you can do. Requirements change, and if you absolutely have to make sure that clients of the API are not broken by a newer API version, you'll have to rely on simply deprecating code until you think that no one is using the deprecated code.
Placing [Obsolete] attributes on code causes the compiler to create warnings if there are any references to the obsolete methods. This way clients of the API, if they are diligent about fixing their compiler warnings, can gradually move to the new methods without having everything break with the new version.
It's useful to use the ObsoleteAttribute overload that takes a string:
[Obsolete("Foo is deprecated. Use Bar instead for munging widgets.")]
<frivolous>
Perhaps you could create a TimeBombAttribute:
[TimeBomb(new DateTime(2010,1,1), "Foo will blow up! Better use Bar, or else.")]
In your code, reflect for methods with the timebomb attribute and throw KaboomException if they are called after the specified date. That'll make sure that after 1st January 2010 no-one is using the obsolete methods, and you can clean up your API nicely. :)
</frivolous>
As Matt says, the Obsolete attribute is your friend... but whenever you apply it, provide details of how to change calling code. That way you've got a lot better chance of people actually changing. You might also want to consider specifying which version you anticipate removing the method in (probably the next major release).
Of course, you should be diligent in making sure you don't call the obsolete code - particularly in sample code.
"Since much of agile development revolves around making something work now and refactoring later if needed"
That's not agile. It's cowboy coding disguised under the label of agile.
The ideal is that whatever you complete is complete, according to whatever Definition of Done you have. Usually the DoD states something along the lines of "feature implemented, tested and related code refactored". Of course, if you are working on a throwaway prototype, you can have a more relaxed DoD.
API modifications are a difficult beast. If they are only project-internal APIs you are modifying, the best way to go is to refactor early. If you need to change the internal API, just go ahead and change all API clients at the same time. This way the refactoring debt does not grow very large and you don't have to use deprecation.
For published APIs you probably have some source and binary compatibility guarantees you have to maintain, at least until the next major release or so. Marking the old APIs deprecated works while maintaining compatibility. As with internal APIs, you should fix your internal code as soon as possible to not use the deprecated APIs.
Matt's answer is solid advice. I just wanted to mention that initially you probably want to use something along the lines of:
[Obsolete("Please use ... instead ", false)]
Once you have the code ported, change the false to true and the compiler will then treat all the calls to the method as an error.
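For comparison only, the same warn-first, fail-later progression can be sketched in Python. Python has no compile step for this, so the check happens at call time instead. The `obsolete` decorator below is a hypothetical helper, not a library API, and `foo`/`bar` are placeholder functions:

```python
import functools
import warnings


def obsolete(message, error=False):
    """Hypothetical decorator: warn on use, or fail when error=True."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if error:
                raise RuntimeError(message)
            warnings.warn(message, DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)
        return wrapper
    return decorate


def bar():
    return 42


@obsolete("foo is deprecated; use bar instead")
def foo():
    return bar()
```

Flipping `error` to `True` once callers have migrated turns every remaining call into a hard failure, which mirrors changing `false` to `true` on the attribute.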
Watch Josh Bloch's "How to Design a Good API and Why It Matters"
Most important w/r/t deprecation is knowing that "when in doubt, leave it out." Watch the video for clarification, but it has to do with having to support what you provide forever. If you are realistically expecting that API to be reused, you're effectively setting your decisions in stone.
I think API design is a much trickier thing to do in an Agile fashion because you're expecting it to be reused probably in many different ways. You have to worry about breaking others that are dependent on you, and so while it can be done, it's tough to have the right design emerge without getting a quick turnaround from other teams. Of course deprecation is going to help here, but I think YAGNI is a lot better design heuristic when it comes to APIs.
I think deprecation of code is an inevitable byproduct of Agile processes like continuous refactoring and incremental development. So if you end up with deprecated code as you work on your project, that's not necessarily a bad thing--just a fact of life. Of course, you will probably find that, rather than deprecating code, you end up keeping a lot of code but refactoring it into different methods, classes, and so on.
So, bottom line: I wouldn't worry about deprecating code during Agile development. If it served its purpose for a while, you're doing the right thing.
The rule of thumb for API design is to focus on what it does, rather than how it does it. Once you know the end goal, figure out the absolute minimum input you need and use that. Avoid passing your own objects as parameters, pass only data.
Separate configuration from execution. For example, maybe you have an image encoder/decoder.
Instead of making a call like:
Encoder.Encode( bytes, width, height, compression_type, compression_ratio, palette, etc etc);
Make it
Encoder.setCompressionType(compression_type);
Encoder.setCompressionRatio(compression_ratio);
etc, etc
Encoder.Encode(bytes, width, height);
That way adding or removing settings is much less likely to break existing implementations.
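A variation on the same idea, sketched in Python: instead of individual setters, the settings live in their own object with defaults, so a new setting does not change the `encode` signature. `EncoderSettings` and the placeholder `encode` body are assumptions made for illustration, not a real encoder.

```python
from dataclasses import dataclass


@dataclass
class EncoderSettings:
    """Hypothetical settings object; new fields get defaults."""
    compression_type: str = "none"
    compression_ratio: float = 1.0


class Encoder:
    def __init__(self, settings=None):
        self.settings = settings or EncoderSettings()

    def encode(self, data, width, height):
        # Placeholder: a real encoder would transform the bytes here.
        header = "%s:%dx%d" % (self.settings.compression_type, width, height)
        return header.encode("ascii") + data


encoder = Encoder(EncoderSettings(compression_type="rle"))
encoded = encoder.encode(b"\x00\x01", 2, 1)
```

Callers that only ever wrote `encode(data, width, height)` keep working when another field is added to `EncoderSettings`.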
For deprecation, there's basically 3 types of APIs: internal, external, and public.
Internal is when it's only your team working on the code. Deprecating these APIs isn't a big deal. Your team is the only one using them, so deprecated APIs aren't around long: there's pressure to change them, people aren't afraid to change them, and people know how to change them.
External is when it's the same code base, but different teams are using it. This might be some common libraries in a large company, or a popular open source library. The point is, people can choose the version of the code they compile with. The ease of deprecating an API depends on the size of the organization and how well it communicates. IMO, it's the deprecator's job to update old code, rather than mark it deprecated and let warnings fly throughout the code base. Why the deprecator instead of the deprecatee? Because the deprecator is in the know; they know what changed and why.
Those two cases are pretty easy. So long as there is backwards compatibility, you can generally do whatever you'd like, update the clients yourself, or convince the maintainers to do it.
Then there are public APIs. These are basically external APIs that the clients don't have much control over, such as a web API. These are incredibly hard to update or deprecate. Most clients won't notice it's broken, won't have someone to fix it, won't get notifications that it's changing, and will only fix it once it's broken (after they've yelled at you for breaking it, of course).
I've had to do the above a few times, and it is such a chore. I think the best you can do is purposefully break it early, wait a bit, and then restore it. You send out the usual warnings and deprecations first, of course, but - trust me - nothing will happen until something breaks.
An idea I've yet to try is to let people register simple apps that run small tests. When you want to do an API update, you run the external tests and contact the affected people.
Another popular approach is to have clients depend on (web) services. There are constructs out there that allow you to version your services and allow clients to perform lookups. This adds a lot more moving parts and complexity to the equation, but can be helpful if you are turning over a lot of versions and have to support multiple versions in production.
This article does a good job of explaining the problem and an approach.

Are code generators bad?

I use MyGeneration along with nHibernate to create the basic POCO objects and XML mapping files. I have heard some people say they think code generators are not a good idea. What is the current best thinking? Is it just that code generation is bad when it generates thousands of lines of not understandable code?
Code generated by a code generator should not (as a generalisation) be used in a situation where it is subsequently edited by hand. Some systems, such as the wizards in various incarnations of Visual C++, generated code that the programmer was then expected to edit by hand. This was not popular, as it required developers to pick apart the generated code, understand it and make modifications. It also meant that the generation process was one-shot.
Generated code should live in separate files from other code in the system and only be produced by the generator. The generated code should be clearly marked as such to indicate that people shouldn't modify it. I have had occasion to build quite a few code-generation systems of one sort or another, and all of the code so generated has something like this in the preamble:
-- =============================================================
-- === Foobar Module ===========================================
-- =============================================================
--
-- === THIS IS GENERATED CODE. DO NOT EDIT. ===
--
-- =============================================================
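A generator can both write that banner and check for it before overwriting a file. A minimal Python sketch, where `render_module` and `is_generated` are hypothetical helper names:

```python
BANNER = "-- === THIS IS GENERATED CODE. DO NOT EDIT. ==="


def render_module(name, body_lines):
    """Return generated source text with the warning banner on top."""
    return "\n".join([BANNER, "-- Module: %s" % name, *body_lines]) + "\n"


def is_generated(text):
    """True if the text starts with the banner, so overwriting is safe."""
    return text.startswith(BANNER)
```

Refusing to overwrite any file for which `is_generated` is false protects hand-written code that ends up in the output directory by mistake.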
Code Generation in Action is quite a good book on the subject.
Code generators are great, bad code is bad.
Most of the other responses on this page are along the lines of "No, because often the generated code is not very good."
This is a poor answer because:
1) Generators are a tool like anything else - if you misuse them, don't blame the tool.
2) Developers tend to pride themselves on their ability to write great code one time, but you don't use code generators for one-off projects.
We use a Code Generation system for persistence in all our Java projects and have thousands of generated classes in production.
As a manager I love them because:
1) Reliability: There are no significant remaining bugs in that code. It has been so exhaustively tested and refined over the years that when debugging I never worry about the persistence layer.
2) Standardisation: Every developer's code is identical in this respect, so there is much less to learn when picking up a new project from a coworker.
3) Evolution: If we find a better way to do things we can update the templates and update 1000's of classes quickly and consistently.
4) Revolution: If we switch to a different persistence system in the future then the fact that every single persistent class has an exactly identical API makes my job far easier.
5) Productivity: It is just a few clicks to build a persistent object system from metadata - this saves thousands of boring developer hours.
Code generation is like using a compiler - on an individual case basis you might be able to write better optimised assembly language, but over large numbers of projects you would rather have the compiler do it for you right?
We employ a simple trick to ensure that classes can always be regenerated without losing customisations: every generated class is abstract. Then the developer extends it with a concrete class, adds the custom business logic and overrides any base class methods he wants to differ from the standard. If there is a change in metadata he can regenerate the abstract class at any time, and if the new model breaks his concrete class the compiler will let him know.
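The answer describes Java, where the compiler catches a mismatch. A rough Python sketch of the same split is below, with invented class names; in Python a mismatch would surface when the class is instantiated or tested rather than at compile time.

```python
from abc import ABC, abstractmethod


class CustomerBase(ABC):
    """Stands in for a GENERATED class; regenerated from metadata."""

    def __init__(self, name):
        self.name = name

    def table_name(self):
        return "customer"

    @abstractmethod
    def display_name(self):
        """Hook that the hand-written subclass must supply."""


class Customer(CustomerBase):
    """Hand-written; never touched by the generator."""

    def display_name(self):
        return self.name.title()
```

Regenerating `CustomerBase` leaves `Customer` untouched, so customisations survive every run of the generator.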
The biggest problem I've had with code generators is during maintenance. If you modify the generated code and then make a change to your schema or template and try to regenerate you can have problems.
One problem is if the tool doesn't allow you to protect changes you've made to the modified code then your changes will be overwritten.
Another problem I've seen, particularly with code generators in RSA for web services, if you change the generated code too much the generator will complain that there is a mismatch and refuse to regenerate the code. This can happen for something as simple as changing the type of a variable. Then you are stuck generating the code to a different project and merging the results back into your original code.
Code generators can be a boon for productivity, but there are a few things to look for:
Let you work the way you want to work.
If you have to bend your non-generated code to fit around the generated code, then you should probably choose a different approach.
Run as part of your regular build.
The output should be generated to an intermediates directory, and not be checked in to source control. The input must be checked in to source control, however.
No install
Ideally, you check the tool in to source control, too. Making people install things when preparing a new build machine is bad news. For example, if you branch, you want to be able to version the tools with the code.
If you must, make a single script that will take a clean machine with a copy of the source tree, and configure the machine as required. Fully automated, please.
No editing output
You shouldn't have to edit the output. If the output isn't useful enough as-is, then the tool isn't working for you.
Also, the output should clearly state that it is a generated file & should not be edited.
Readable output
The output should be written & formatted well. You want to be able to open the output & read it without a lot of trouble.
#line
Many languages support something like a #line directive, which lets you map the contents of the output back to the input, for example when producing compiler error messages or when stepping in the debugger. This can be useful, but it can also be annoying unless done really well, so it's not a requirement.
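To make a few of these points concrete, here is a rough sketch of a tiny generator that could run as a build step. The input format ('NAME = value' lines), the output file naming and the banner text are all assumptions made up for this example.

```python
from pathlib import Path

# Banner so nobody mistakes the output for hand-written code.
BANNER = "# GENERATED FILE - DO NOT EDIT. Source: {source}\n"


def generate(source: Path, out_dir: Path) -> Path:
    """Turn a file of 'NAME = value' lines into a Python constants module."""
    # Output goes to an intermediates directory that is not checked in.
    out_dir.mkdir(parents=True, exist_ok=True)
    lines = [BANNER.format(source=source.name)]
    for raw in source.read_text().splitlines():
        raw = raw.strip()
        if not raw or raw.startswith("#"):
            continue  # skip blank lines and comments in the input
        name, value = (part.strip() for part in raw.split("=", 1))
        # Every value is emitted as a string literal to keep the sketch simple.
        lines.append(f"{name.upper()} = {value!r}\n")
    target = out_dir / (source.stem + "_constants.py")
    target.write_text("".join(lines))
    return target
```

The input file is what you check in; the output is rebuilt on every build, says up front that it is generated, and stays readable.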
My stance is that code generators are not bad, but MANY uses of them are.
If you are using a code generator that writes good code in order to save time, then great. But often the generated code is not optimized or adds a lot of overhead, and in those cases I think it is bad.
Code generation might cause you some grief if you like to mix behaviour into your classes. An equally productive alternative might be attributes/annotations and runtime reflection.
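As a rough illustration of the annotation-plus-reflection alternative: the Order class and the "column" metadata key below are invented for this sketch. Instead of generating a mapping method per class, one function inspects the annotated fields at runtime.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Order:
    # The 'column' metadata plays the role of an attribute/annotation.
    id: int = field(metadata={"column": "order_id"})
    total: float = field(metadata={"column": "total_amount"})


def to_row(obj) -> dict:
    """Map any annotated dataclass instance to column names by inspecting it at runtime."""
    return {f.metadata["column"]: getattr(obj, f.name) for f in fields(obj)}
```

Because Order is an ordinary hand-written class, you can mix whatever behaviour you like into it without a generator overwriting it.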
Compilers are code generators, so they are not inherently bad unless you only like to program in raw machine code.
I believe, however, that code generators should always completely encapsulate the generated code, i.e. you should never have to modify the generated code by hand; any change should be made by modifying the input to the generator and regenerating the code.
If it's a mainframe COBOL code generator that Fran Tarkenton is trying to sell you, then absolutely yes!
I've written a few code generators before - and to be honest they saved my butt more than once!
Once you have a clearly defined object - collection - user control design, you can use a code generator to build the basics for you, which lets you spend your time as a developer more effectively on the complex stuff. After all, who really wants to write 300+ public property declarations and variable instantiations? I'd rather get stuck into the business logic than all the mindless repetitive tasks.
The mistake many people make when using code generation is to edit the generated code. If you keep in mind that whenever you feel you need to edit the generated code, what you actually need to edit is the code generation tool, it's a boon to productivity. If you are constantly fighting the code that gets generated, it's going to end up costing productivity.
The best code generators I've found are those that allow you to edit the templates that generate the code. I really like Codesmith for this reason, because it's template-based and the templates are easily editable. When you find there is a deficiency in the code that gets generated, you just edit the template and regenerate your code and you are forever good after that.
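This is not Codesmith's template syntax, just a sketch of the same idea using Python's standard string.Template. The template text and the render_class helper are made up for illustration.

```python
from string import Template

# Hypothetical template; editing this text changes every generated class.
CLASS_TEMPLATE = Template(
    "class $name:\n"
    "    def __init__(self, $args):\n"
    "$assignments\n"
)


def render_class(name: str, props: list[str]) -> str:
    """Render source code for a class with one attribute per property.

    Assumes props is non-empty; an empty list would produce an empty body.
    """
    assignments = "\n".join(f"        self.{p} = {p}" for p in props)
    return CLASS_TEMPLATE.substitute(
        name=name, args=", ".join(props), assignments=assignments
    )
```

If the generated classes turn out to be deficient, the fix goes into CLASS_TEMPLATE once and every class picks it up on the next run.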
The other thing that I've found is that a lot of code generators aren't super easy to use with a source control system. The way we've gotten around this is to check in the templates rather than the code and the only thing we check into source control that is generated is a compiled version of the generated code (DLL files, mostly). This saves you a lot of grief because you only have to check in a few DLLs rather than possibly hundreds of generated files.
Our current project makes heavy use of a code generator. That means I've seen both the "obvious" benefits of generating code for the first time - no coder error, no typos, better adherence to a standard coding style - and, after a few months in maintenance mode, the unexpected downsides. Our code generator did, indeed, improve our codebase quality initially. We made sure that it was fully automated and integrated with our automated builds. However, I would say that:
(1) A code generator can be a crutch. We have several massive, ugly blobs of tough-to-maintain code in our system now, because at one point in the past it was easier to add twenty new classes to our code generation XML file, than it was to do proper analysis and class refactoring.
(2) Exceptions to the rule kill you. We use the code generator to create several hundred Screen and Business Object classes. Initially, we enforced a standard on what methods could appear in a class, but like all standards, we started making exceptions. Now, our code generation XML file is a massive monster, filled with special-case snippets of Java code that are inserted into select classes. It's nearly impossible to parse or understand.
(3) Since so much of our code is generated, using values from a database, it's proven difficult for developers to maintain a consistent code base on their individual workstations (since there can be multiple versions of the database). Debugging and tracing through the software is a lot harder, and newbies to the team take much longer to figure out the "flow" of the code, because of the extra abstraction and implicit relationships between classes. IDEs cannot pick up relationships between two classes that communicate via a code-generated class.
That's probably enough for now. I think Code Generators are great as part of a developer's individual toolkit; a set of scripts that writes out your boilerplate code makes starting a project a lot easier. But Code Generators do not make maintenance problems go away.
In certain (not many) cases they are useful, such as when you want to generate classes based on lookup-type data in the database tables.
Code generators are bad when they make programming more difficult (e.g. poorly generated code, or a maintenance nightmare), but they are good when they make programming more efficient.
They probably don't always generate optimal code, but depending on your need, you might decide that the developer man-hours saved make up for a few minor issues.
All that said, my biggest gripe with ORM code generators is that maintaining the generated code can be a PITA if the schema changes.
Code generators are not bad, but sometimes they are used in situations where another solution exists (e.g. instantiating a million objects when an array of objects would have been more suitable and could have been done in a few lines of code).
The other situation is when they are used incorrectly, or coded badly. Too many people swear off code generators because they've had bad experiences due to bugs, or because they misunderstood how to configure them correctly.
But in and of themselves, code generators are not bad.
-Adam
They are like any other tool. Some give better results than others, but it is up to the user to know when to use them or not. A hammer is a terrible tool if you are trying to screw in a screw.
This is one of those highly contentious issues. Personally, I think code generators are really bad due to the unoptimized crap code most of them put out.
However, the question is really one that only you can answer. In a lot of organizations, development time is more important than project execution speed or even maintainability.
We use code generators for generating data entity classes, database objects (like triggers and stored procs), service proxies etc. Anywhere you see a lot of repetitive code following a pattern and a lot of manual work involved, code generators can help. But you should not use them so much that maintainability becomes a pain. Some issues also arise if you want to regenerate the code.
Tools like Visual Studio and Codesmith have their own templates for most of the common tasks and make this process easier. But it is also easy to roll your own.
It can really become an issue with maintainability when you have to come back and can't understand what is going on in the code. Therefore you often have to weigh how important it is to get the project done fast against how easy it will be to maintain.
maintainability <> easy or fast coding process
I use My Generation with Entity Spaces and I don't have any issues with it. If I have a schema change I just regenerate the classes and it all works out just fine.
They serve as a crutch that can disable your ability to maintain the program long-term.
The first C++ compilers were code generators that spit out C code (Cfront).
I'm not sure if this is an argument for or against code generators.
I think that Mitchel has hit it on the head.
Code generation has its place. There are some circumstances where it's more effective to have the computer do the work for you!
It can give you the freedom to change your mind about the implementation of a particular component when the time cost of making the code changes is small. Of course, it is still probably important to understand the output of the code generator, but not always.
We had an example on a project we just finished where a number of C++ apps needed to communicate with a C# app over named pipes. It was better for us to use small, simple files that defined the messages and have all the classes and code generated for each side of the transaction. When a programmer was working on problem X, the last thing they needed was to worry about the implementation details of the messages and the inevitable cache hit that would entail.
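A rough sketch of what such a setup could look like. This is not the generator from that project: the message definition format, the type mappings and the function names are all invented here, and serialization over the pipe is left out entirely.

```python
# Hypothetical message definitions: message name -> list of (field, type) pairs.
MESSAGES = {"Ping": [("sequence", "int"), ("sender", "string")]}

# Assumed mapping from the definition's type names to each target language.
CPP_TYPES = {"int": "int32_t", "string": "std::string"}
CS_TYPES = {"int": "int", "string": "string"}


def emit_cpp(name, message_fields):
    """Return C++ source text for a plain struct holding the message fields."""
    body = "".join(f"    {CPP_TYPES[t]} {f};\n" for f, t in message_fields)
    return f"struct {name} {{\n{body}}};\n"


def emit_csharp(name, message_fields):
    """Return C# source text for a class holding the same fields."""
    body = "".join(f"    public {CS_TYPES[t]} {f};\n" for f, t in message_fields)
    return f"public class {name}\n{{\n{body}}}\n"
```

Both sides come from the same definition, so adding a field to a message is a one-line change that cannot drift between the C++ and C# code.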
This is a workflow question. ASP.NET is a code generator. The XAML parsing engine actually generates C# before it gets converted to MSIL. When a code generator becomes an external product like CodeSmith that is isolated from your development workflow, special care must be taken to keep your project in sync. For example, if the generated code is ORM output, and you make a change to the database schema, you will have to either completely abandon the code generator or else take advantage of C#'s capacity to work with partial classes (which let you add members and functionality to an existing class without inheriting from it).
I personally dislike the isolated / Alt-Tab nature of generator workflows; if the code generator is not part of my IDE then I feel like it's a kludge. Some code generators, such as Entity Spaces 2009 (not yet released), are more integrated than previous generations of generators.
I think the real answer for code generators lies in precompilation routines. C# and other .NET languages lack this, although ASP.NET enjoys it and that's why, say, SubSonic works so well for ASP.NET but not much else. SubSonic generates C# code at build-time just before the normal ASP.NET compilation kicks in.
Ask your tools vendor (i.e. Microsoft) to support pre-build routines more thoroughly, so that code generators can be integrated into the workflow of your solutions using metadata, rather than manually managed as externally outputted code files that have to be maintained in isolation.
Jon
The best application of a code generator is when the entire project is a model, and all the project's source code is generated from that model. I am not talking UML and related crap. In this case, the project model also contains custom code.
Then the only thing developers have to care about is the model. A simple architectural change may result in instant modification of thousands of source code lines. But everything remains in sync.
This is IMHO the best approach. Sound utopian? At least I know it's not ;) The near future will tell.
In a recent project we built our own code generator. We generated all the database stuff, and all the base code for our view and view controller classes. Although the generator took several months to build (mostly because this was the first time we had done this, and we had a couple of false starts), it paid for itself the first time we ran it and generated the basic framework for the whole app in about ten minutes.
This was all in Java, but Ruby makes an excellent code-writing language particularly for small, one-off type projects.
The best thing was the consistency of the code and the project organization. In addition you kind of have to think the basic framework out ahead of time, which is always good.
Code generators are great, assuming it is a good code generator, especially when working in C++/Java, which are very verbose.