Related
Greetings!
I inherited a C#.NET application I have been extending and improving for a while now. Overall it was obviously a rush-job (or whoever wrote it was seemingly less competent than myself). The app pulls some data from an embedded device & displays and manipulates it. At the core is a communications thread in the main application form which executes a 600+ lines of code method which calls functions all over the place, implementing a state machine - lots of if-state-then-do type code. Interaction with the device is done by setting the state/mode globally and letting the thread do it's thing. (This is just one example of the badness of the code - overall it is not very OO-like, it reminds of the style of embedded C code the device firmware is written in).
My problem is that this piece of code is central to the application. The software, communications protocol or device firmware are not documented at all. Obviously to carry on with my work I have to interact with this code.
What I would like some guidance on, is whether it is worth scrapping this code & trying to piece together something more reasonable from the information I can reverse engineer? I can't decide! The reason I don't want to refactor is because the code already works, and changing it will surely be a long, laborious and unpleasant task. On the flip side, not refactoring means I have to sometimes compromise the design of other modules so that I may call my code from this state machine!
I've heard of "If it ain't broke don't fix it!", so I am wondering if it should apply when "it" is influencing the design of future code! Any advice would be appreciated!
Thanks!
Also, the longer you wait, the worse the codebase will smell. My suggestion would be first create a testsuite that you can evaluate your refactoring against. This makes it a lot easier to see if you are refactoring or just plain breaking things :).
I would definitely recommend you to refactor the code if you feel its junky. Yes, during the process of refactoring you may have some inconsistencies/problems at the start. But that is why we have iterations and testing. Since you are going to build up on this core engine in future, why not make the basement as stable as possible.
However, be very sure on what you are going to do. Because at times long lines of code does not necessarily mean evil. On the other hand they may be very efficient in running time. If/else blocks are not bad if you ask me, as they are very intelligent in branching from a microprocessor's perspective. So, you will have to be judgmental and very clear before you touch this.
But once you refactor the code, you will definitely have fine control over it. And don't forget to document it!! Tomorrow, someone might very well come and say about you on whatever you've told about this guy who have written that core code.
This depends on the constraints you are facing, it's a decision to be based on practical basis, not on theoretical ones. You need three things to consider.
Time: you need to have enough time to learn it, implement it, and test it, without too many other tasks interrupting you
Boss #1: if you are working for someone, he needs to know and approve the time and effort you will spend immediately, required to rebuild your solution
Boss #2: your boss also needs to know that the advantage of having new and clean software will come at the price of possible regressions, and therefore at the beginning of the deployment there may be unexpected bugs
If you have those three, then go ahead and refactor it. It will be surely be worth it!
First and foremost, get all the business logic out of the Form. Second, locate all the parts where the code interacts with the global state (e.g. accessing the embedded system). Delegate all this access to methods. Then, move these methods into a new class and create an instance in the class's constructor. Finally, inject an instance for the class to use.
Following these steps, you can move your embedded system logic ("existing module") to a wrapper class you write, so the interface can be nice and clean and more manageable. Then you can better tackle refactoring the monster method because there is less global state to worry about (only local state).
If the code works and you can integrate your part with minimal changes to it then let the code as it is and do your integration.
If the code is simply a big barrier in your way to add new functionality then it is best for you to refactor it.
Talk with other people that are responsible for the project, explain the situation, give an estimation explaining the benefits gained after refactoring the code and I'm sure (I hope) that the best choice will be made. It is best to speak about what you think, don't keep anything inside, especially if this affects your productivity, motivation etc.
NOTE: Usually rewriting code is out of the question but depending on situation and amount of code needed to be rewritten the decision may vary.
You say that this is having an impact on the future design of the system. In this case I would say it is broken and does need fixing.
But you do have to take into account the business requirements. Often reality gets in the way!
Would it be possible to wrap this code up in another class whose interface better suits how you want to take the system forward? (See adapter pattern)
This would allow you to move forward with your requirements without the poor design having an impact.
It gives you an interface that you understand which you could write some unit tests for. These tests can be based on what your design requires from this code. It ensures that your assumptions about what it is doing is correct. If you say that this code works, then any failing tests may be that your assumptions are incorrect.
Once you have these tests you can safely refactor - one step at a time, and when you have some spare time or when it is needed - as per business requirements.
Quite often I find the best way to truly understand a piece of code is to refactor it.
EDIT
On reflection, as this is one big method with multiple calls to the outside world, you are going to need some kind of inverse Adapter class to wrap this method. If you can inject dependencies into the method (see Dependency Inversion such that the method calls methods in your classes then you can route these to the original calls.
Lately, I've been making use a lot of the Strategy Pattern along with the Factory Pattern. And I really mean a lot. I have a lot of "algorithms" for everything and factories that retrieve algorithms based on parameters.
Even though the code seems very extensible, and it is, having N factories seems a bit of an abuse.
I know this is pretty subjective, and we're talking without seeing code, but is this acceptable in real world code? Would you change something ?
OK- ask yourself a question. does/will this algorithm implementation ever change? If no then remove the strategy.
I am maintence.
I once was forced (by my pattern lovin' boss) to write a set of 16 "buffer interpreter tuxedo services" using an AbstractFactory and a double DAO pattern in C++ (no reflections, no code gen). All up it something like 20,000 lines of the nastiest code I've even seen (not the least because I didn't really know C++ when I started) and it took about three months.
Since my old boss has moved on I've rewritten them using good 'ole "straight up and down" procedural style C++, with couple of funky-macros... each service is like 60 lines of code, times 16... all up less than a 1000 lines of really SIMPLE code; so simple that even I can follow it.
Cheers. Keith.
Whenever I'm implementing code in this fashion, some questions I ask are:
what components do I need to substitute to test ?
what components will I expect users/admins to disable or substitute (e.g. via Spring configs or similar) ?
what components do I expect or suspect will not be required in the future due to (possibly) changing requirements ?
This all drives how I construct object or components (via factories) and how I implement algorithms. The above is vague, but (of course) the requirements can be similarly difficult to pin down. Without seeing your implementation and your requirements, I can't comment further, but the above should act as some guideline to help you determine whether what you've done is overkill.
If you're using the same design pattern all over the place, perhaps you should either switch to a language that has better support for what you're trying to do or rethink your code to be more idiomatic in your language of choice. After all, that's why we have more than one programming language.
Would depends on the kind of software I'm working on.
Maintenance asks for simple code and factories is NOT simple code.
But extensibility asks sometimes for factories...
So you have to take both in consideration.
Just have in mind that most of the time, you will have to maintain a source file MANY times a year and you will NOT have to extend it.
IMO, patterns should only be used when absolutely needed. If you think it can be handy in two years, you are better to use them... in two years.
How complex is the work the factory is handling? Does object creation really need to be abstracted to a different class? A variation of the factory method is having a simple, in-class factory. This really works best if any dependencies have already been injected.
For instance,
public class Customer
{
public Customer CreateNewCustomer()
{
// handle minimally complex create logic here
}
}
As far as Strategy overuse... Again, as #RichardOD explained, will the algorithm ever really change?
Always keep in mind the YAGNI principle. You Aren't Gonna Need It.
Can't you make an AbstractFactory instead off different standalone factories?
AlgorithmFactory creates the algorithms you need based on the concrete factory.
I tend to do a lot of projects on short deadlines and with lots of code that will never be used again, so there's always pressure/temptation to cut corners. One rule I always stick to is encapsulation/loose coupling, so I have lots of small classes rather than one giant God class. But what else should I never compromise on?
Update - thanks for the great response. Lots of people have suggested unit testing, but I don't think that's really appropriate to the kind of UI coding I do. Usability / User acceptance testing seems much important. To reiterate, I'm talking about the BARE MINIMUM of coding standards for impossible deadline projects.
Not OOP, but a practice that helps in both the short and long run is DRY, Don't Repeat Yourself. Don't use copy/paste inheritance.
Not a OOP practice, but common sense ;-).
If you are in a hurry, and have to write a hack. Always add a piece of comment with the reasons. So you can trace it back and make a good solution later.
If you never had the time to come back, you always have the comment so you know, why the solution was chosen at the moment.
Use Source control.
No matter how long it takes to set up (seconds..), it will always make your life easier! (still it's not OOP related).
Naming. Under pressure you'll write horrible code that you won't have time to document or even comment. Naming variables, methods and classes as explicitly as possible takes almost no additional time and will make the mess readable when you must fix it. From an OOP point of view, using nouns for classes and verbs for methods naturally helps encapsulation and modularity.
Unit tests - helps you sleep at night :-)
This is rather obvious (I hope), but at the very least I always make sure my public interface is as correct as possible. The internals of a class can always be refactored later on.
no public class with mutable public variables (struct-like).
Before you know it, you refer to this public variable all over your code, and the day you decide this field is a computed one and must have some logic in it... the refactoring gets messy.
If that day is before your release date, it gets messier.
Think about the people (may even be your future self) who have to read and understand the code at some point.
Application of the single responsibility principal. Effectively applying this principal generates a lot of positive externalities.
Like everyone else, not as much OOP practices, as much as practices for coding that apply to OOP.
Unit test, unit test, unit test. Defined unit tests have a habit of keeping people on task and not "wandering" aimlessly between objects.
Define and document all hierarchical information (namespaces, packages, folder structures, etc.) prior to writing production code. This helps to flesh out object relations and expose flaws in assumptions related to relationships of objects.
Define and document all applicable interfaces prior to writing production code. If done by a lead or an architect, this practice can additionally help keep more junior-level developers on task.
There are probably countless other "shoulds", but if I had to pick my top three, that would be the list.
Edit in response to comment:
This is precisely why you need to do these things up front. All of these sorts of practices make continued maintenance easier. As you assume more risk in the kickoff of a project, the more likely it is that you will spend more and more time maintaining the code. Granted, there is a larger upfront cost, but building on a solid foundation pays for itself. Is your obstacle lack of time (i.e. having to maintain other applications) or a decision from higher up? I have had to fight both of those fronts to be able to adopt these kinds of practices, and it isn't a pleasant situation to be in.
Of course everything should be Unit tested, well designed, commented, checked into source control and free of bugs. But life is not like that.
My personal ranking is this:
Use source control and actually write commit comments. This way you have a tiny bit of documentation should you ever wonder "what the heck did I think when I wrote this?"
Write clean code or document. Clean well-written code should need little documentation, as it's meaning can be grasped from reading it. Hacks are a lot different. Write why you did it, what you do and what you'd like to do if you had the time/knowledge/motivation/... to do it right
Unit Test. Yes it's down on number three. Not because it's unimportant but because it's useless if you don't have the other two at least halfway complete. Writing Unit tests is another level of documentation what you code should be doing (among others).
Refactor before you add something. This might sound like a typical "but we don't have time for it" point. But as with many of those points it usually saves more time than it costs. At least if you have at least some experience with it.
I'm aware that much of this has already been mentioned, but since it's a rather subjective matter, I wanted to add my ranking.
[insert boilerplate not-OOP specific caveat here]
Separation of concerns, unit tests, and that feeling that if something is too complex it's probably not conceptualised quite right yet.
UML sketching: this has clarified and saved any amount of wasted effort so many times. Pictures are great aren't they? :)
Really thinking about is-a's and has-a's. Getting this right first time is so important.
No matter how fast a company wants it, I pretty much always try to write code to the best of my ability.
I don't find it takes any longer and usually saves a lot of time, even in the short-term.
I've can't remember ever writing code and never looking at it again, I always make a few passes over it to test and debug it, and even in those few passes practices like refactoring to keep my code DRY, documentation (to some degree), separation of concerns and cohesion all seem to save time.
This includes crating many more small classes than most people (One concern per class, please) and often extracting initialization data into external files (or arrays) and writing little parsers for that data... Sometimes even writing little GUIs instead of editing data by hand.
Coding itself is pretty quick and easy, debugging crap someone wrote when they were "Under pressure" is what takes all the time!
At almost a year into my current project I finally set up an automated build that pushes any new commits to the test server, and man, I wish I had done that on day one. The biggest mistake I made early-on was going dark. With every feature, enhancement, bug-fix etc, I had a bad case of the "just one mores" before I would let anyone see the product, and it literally spiraled into a six month cycle. If every reasonable change had been automatically pushed out it would have been harder for me to hide, and I would have been more on-track with regard to the stakeholders' involvement.
Go back to code you wrote a few days/weeks ago and spend 20 minutes reviewing your own code. With the passage of time, you will be able to determine whether your "off-the-cuff" code is organized well enough for future maintenance efforts. While you're in there, look for refactoring and renaming opportunities.
I sometimes find that the name I chose for a function at the outset doesn't perfectly fit the function in its final form. With refactoring tools, you can easily change the name early before it goes into widespread use.
Just like everybody else has suggested these recommendations aren't specific to OOP:
Ensure that you comment your code and use sensibly named variables. If you ever have to look back upon the quick and dirty code you've written, you should be able to understand it easily. A general rule that I follow is; if you deleted all of the code and only had the comments left, you should still be able to understand the program flow.
Hacks usually tend to be convoluted and un-intuitive, so some good commenting is essential.
I'd also recommend that if you usually have to work to tight deadlines, get yourself a code library built up based upon your most common tasks. This will allow you to "join the dots" rather than reinvent the wheel each time you have a project.
Regards,
Docta
An actual OOP practice I always make time for is the Single Responsibility Principle, because it becomes so much harder to properly refactor the code later on when the project is "live".
By sticking to this principle I find that the code I write is easily re-used, replaced or rewritten if it fails to match the functional or non-functional requirements. When you end up with classes that have multiple responsibilities, some of them may fulfill the requirements, some may not, and the whole may be entirely unclear.
These kinds of classes are stressful to maintain because you are never sure what your "fix" will break.
For this special case (short deadlines and with lots of code that will never be used again) I suggest you to pay attention to embedding some script engine into your OOP code.
Learn to "refactor as-you-go". Mainly from an "extract method" standpoint. When you start to write a block of sequential code, take a few seconds to decide if this block could stand-alone as a reusable method and, if so, make that method immediately. I recommend it even for throw-away projects (especially if you can go back later and compile such methods into your personal toolbox API). It doesn't take long before you do it almost without thinking.
Hopefully you do this already and I'm preaching to the choir.
I use MyGeneration along with nHibernate to create the basic POCO objects and XML mapping files. I have heard some people say they think code generators are not a good idea. What is the current best thinking? Is it just that code generation is bad when it generates thousands of lines of not understandable code?
Code generated by a code-generator should not (as a generalisation) be used in a situation where it is subsequently edited by human intervention. Some systems such the wizards on various incarnations of Visual C++ generated code that the programmer was then expected to edit by hand. This was not popular as it required developers to pick apart the generated code, understand it and make modifications. It also meant that the generation process was one shot.
Generated code should live in separate files from other code in the system and only be generated from the generator. The generated code code should be clearly marked as such to indicate that people shouldn't modify it. I have had occasion to do quite a few code-generation systems of one sort or another and All of the code so generated has something like this in the preamble:
-- =============================================================
-- === Foobar Module ===========================================
-- =============================================================
--
-- === THIS IS GENERATED CODE. DO NOT EDIT. ===
--
-- =============================================================
Code Generation in Action is quite a good book on the subject.
Code generators are great, bad code is bad.
Most of the other responses on this page are along the lines of "No, because often the generated code is not very good."
This is a poor answer because:
1) Generators are tool like anything else - if you misuse them, dont blame the tool.
2) Developers tend to pride themselves on their ability to write great code one time, but you dont use code generators for one off projects.
We use a Code Generation system for persistence in all our Java projects and have thousands of generated classes in production.
As a manager I love them because:
1) Reliability: There are no significant remaining bugs in that code. It has been so exhaustively tested and refined over the years than when debugging I never worry about the persistence layer.
2) Standardisation: Every developers code is identical in this respect so there is much less for a guy to learn when picking up a new project from a coworker.
3) Evolution: If we find a better way to do things we can update the templates and update 1000's of classes quickly and consistently.
4) Revolution: If we switch to a different persistence system in the future then the fact that every single persistent class has an exactly identical API makes my job far easier.
5) Productivity: It is just a few clicks to build a persistent object system from metadata - this saves thousands of boring developer hours.
Code generation is like using a compiler - on an individual case basis you might be able to write better optimised assembly language, but over large numbers of projects you would rather have the compiler do it for you right?
We employ a simple trick to ensure that classes can always be regenerated without losing customisations: every generated class is abstract. Then the developer extends it with a concrete class, adds the custom business logic and overrides any base class methods he wants to differ from the standard. If there is a change in metadata he can regenerate the abstract class at any time, and if the new model breaks his concrete class the compiler will let him know.
The biggest problem I've had with code generators is during maintenance. If you modify the generated code and then make a change to your schema or template and try to regenerate you can have problems.
One problem is if the tool doesn't allow you to protect changes you've made to the modified code then your changes will be overwritten.
Another problem I've seen, particularly with code generators in RSA for web services, if you change the generated code too much the generator will complain that there is a mismatch and refuse to regenerate the code. This can happen for something as simple as changing the type of a variable. Then you are stuck generating the code to a different project and merging the results back into your original code.
Code generators can be a boon for productivity, but there are a few things to look for:
Let you work the way you want to work.
If you have to bend your non-generated code to fit around the generated code, then you should probably choose a different approach.
Run as part of your regular build.
The output should be generated to an intermediates directory, and not be checked in to source control. The input must be checked in to source control, however.
No install
Ideally, you check the tool in to source control, too. Making people install things when preparing a new build machine is bad news. For example, if you branch, you want to be able to version the tools with the code.
If you must, make a single script that will take a clean machine with a copy of the source tree, and configure the machine as required. Fully automated, please.
No editing output
You shouldn't have to edit the output. If the output isn't useful enough as-is, then the tool isn't working for you.
Also, the output should clearly state that it is a generated file & should not be edited.
Readable output
The output should be written & formatted well. You want to be able to open the output & read it without a lot of trouble.
#line
Many languages support something like a #line directive, which lets you map the contents of the output back to the input, for example when producing compiler error messages or when stepping in the debugger. This can be useful, but it can also be annoying unless done really well, so it's not a requirement.
My stance is that code generators are not bad, but MANY uses of them are.
If you are using a code generator for time savings that writes good code, then great, but often times it is not optimized, or adds a lot of overhead, in those cases I think it is bad.
Code generation might cause you some grief if you like to mix behaviour into your classes. An equally productive alternative might be attributes/annotations and runtime reflection.
Compilers are code generators, so they are not inherently bad unless you only like to program in raw machine code.
I believe however that code generators should always completely encapsulate the generated code. I.e. you should never have to modify the generated code by hand, any change should be done by modifying the input to the generator and regenerate the code.
If its a mainframe cobol code generator that Fran Tarkenton is trying to sell you then absolutely yes!
I've written a few code generators before - and to be honest they saved my butt more than once!
Once you have a clearly defined object - collection - user control design, you can use a code generator to build the basics for you, allowing your time as a developer to be used more effectively in building the complex stuff, after all, who really wants to write 300+ public property declarations and variable instatiations? I'd rather get stuck into the business logic than all the mindless repetitive tasks.
The mistake many people make when using code generation is to edit the generated code. If you keep in mind that if you feel like you need to edit the code, you actually need to be editing the code generation tool it's a boon to productivity. If you are constantly fighting the code that gets generated it's going to end up costing productivity.
The best code generators I've found are those that allow you to edit the templates that generate the code. I really like Codesmith for this reason, because it's template-based and the templates are easily editable. When you find there is a deficiency in the code that gets generated, you just edit the template and regenerate your code and you are forever good after that.
The other thing that I've found is that a lot of code generators aren't super easy to use with a source control system. The way we've gotten around this is to check in the templates rather than the code and the only thing we check into source control that is generated is a compiled version of the generated code (DLL files, mostly). This saves you a lot of grief because you only have to check in a few DLLs rather than possibly hundreds of generated files.
Our current project makes heavy use of a code generator. That means I've seen both the "obvious" benefits of generating code for the first time - no coder error, no typos, better adherence to a standard coding style - and, after a few months in maintenance mode, the unexpected downsides. Our code generator did, indeed, improve our codebase quality initially. We made sure that it was fully automated and integrated with our automated builds. However, I would say that:
(1) A code generator can be a crutch. We have several massive, ugly blobs of tough-to-maintain code in our system now, because at one point in the past it was easier to add twenty new classes to our code generation XML file, than it was to do proper analysis and class refactoring.
(2) Exceptions to the rule kill you. We use the code generator to create several hundred Screen and Business Object classes. Initially, we enforced a standard on what methods could appear in a class, but like all standards, we started making exceptions. Now, our code generation XML file is a massive monster, filled with special-case snippets of Java code that are inserted into select classes. It's nearly impossible to parse or understand.
(3) Since so much of our code is generated, using values from a database, it's proven difficult for developers to maintain a consistent code base on their individual workstations (since there can be multiple versions of the database). Debugging and tracing through the software is a lot harder, and newbies to the team take much longer to figure out the "flow" of the code, because of the extra abstraction and implicit relationships between classes. IDE's cannot pick up relationships between two classes that communicate via a code-generated class.
That's probably enough for now. I think Code Generators are great as part of a developer's individual toolkit; a set of scripts that write out your boilerplate code make starting a project a lot easier. But Code Generators do not make maintenance problems go away.
In certain (not many) cases they are useful. Such as if you want to generate classes based on lookup-type data in the database tables.
Code generation is bad when it makes programming more difficult (IE, poorly generated code, or a maintenance nightmare), but they are good when they make programming more efficient.
They probably don't always generate optimal code, but depending on your need, you might decide that developer manhours saved make up for a few minor issues.
All that said, my biggest gripe with ORM code generators is that maintenance the generated code can be a PITA if the schema changes.
Code generators are not bad, but sometimes they are used in situations when another solution exists (ie, instantiating a million objects when an array of objects would have been more suitable and accomplished in a few lines of code).
The other situation is when they are used incorrectly, or coded badly. Too many people swear off code generators because they've had bad experiences due to bugs, or their misunderstanding of how to correctly configure it.
But in and of themselves, code generators are not bad.
-Adam
They are like any other tool. Some give beter results than others, but it is up to the user to know when to use them or not. A hammer is a terrible tool if you are trying to screw in a screw.
This is one of those highly contentious issues. Personally, I think code generators are really bad due to the unoptimized crap code most of them put out.
However, the question is really one that only you can answer. In a lot of organizations, development time is more important than project execution speed or even maintainability.
We use code generators for generating data entity classes, database objects (like triggers, stored procs), service proxies etc. Anywhere you see lot of repititive code following a pattern and lot of manual work involved, code generators can help. But, you should not use it too much to the extend that maintainability is a pain. Some issues also arise if you want to regenerate them.
Tools like Visual Studio, Codesmith have their own templates for most of the common tasks and make this process easier. But, it is easy to roll out on your own.
It can really become an issue with maintainability when you have to come back and cant understand what is going on in the code. Therefore many times you have to weigh how important it is to get the project done fast compared to easy maintainability
maintainability <> easy or fast coding process
I use My Generation with Entity Spaces and I don't have any issues with it. If I have a schema change I just regenerate the classes and it all works out just fine.
They serve as a crutch that can disable your ability to maintain the program long-term.
The first C++ compilers were code generators that spit out C code (CFront).
I'm not sure if this is an argument for or against code generators.
I think that Mitchel has hit it on the head.
Code generation has its place. There are some circumstances where it's more effective to have the computer do the work for you!
It can give you the freedom to change your mind about the implementation of a particular component when the time cost of making the code changes is small. Of course, it is still probably important to understand the output the code generator, but not always.
We had an example on a project we just finished where a number of C++ apps needed to communicate with a C# app over named pipes. It was better for us to use small, simple, files that defined the messages and have all the classes and code generated for each side of the transaction. When a programmer was working on problem X, the last thing they needed was to worry about the implentation details of the messages and the inevitable cache hit that would entail.
This is a workflow question. ASP.NET is a code generator. The XAML parsing engine actually generates C# before it gets converted to MSIL. When a code generator becomes an external product like CodeSmith that is isolated from your development workflow, special care must be taken to keep your project in sync. For example, if the generated code is ORM output, and you make a change to the database schema, you will either have to either completely abandon the code generator or else take advantage of C#'s capacity to work with partial classes (which let you add members and functionality to an existing class without inheriting it).
I personally dislike the isolated / Alt-Tab nature of generator workflows; if the code generator is not part of my IDE then I feel like it's a kludge. Some code generators, such as Entity Spaces 2009 (not yet released), are more integrated than previous generations of generators.
I think the panacea to the purpose of code generators can be enjoyed in precompilation routines. C# and other .NET languages lack this, although ASP.NET enjoys it and that's why, say, SubSonic works so well for ASP.NET but not much else. SubSonic generates C# code at build-time just before the normal ASP.NET compilation kicks in.
Ask your tools vendor (i.e. Microsoft) to support pre-build routines more thoroughly, so that code generators can be integrated into the workflow of your solutions using metadata, rather than manually managed as externally outputted code files that have to be maintained in isolation.
Jon
The best application of a code generator is when the entire project is a model, and all the project's source code is generated from that model. I am not talking UML and related crap. In this case, the project model also contains custom code.
Then the only thing developers have to care about is the model. A simple architectural change may result in instant modification of thousands of source code lines. But everything remains in sync.
This is IMHO the best approach. Sound utopic? At least I know it's not ;) The near future will tell.
In a recent project we built our own code generator. We generated all the data base stuff, and all the base code for our view and view controller classes. Although the generator took several months to build (mostly because this was the first time we had done this, and we had a couple of false starts) it paid for itself the first time we ran it and generated the basic framework for the whole app in about ten minutes.
This was all in Java, but Ruby makes an excellent code-writing language particularly for small, one-off type projects.
The best thing was the consistency of the code and the project organization. In addition you kind of have to think the basic framework out ahead of time, which is always good.
Code generators are great assuming it is a good code generator. Especially working c++/java which is very verbose.
If you take over a project from someone to do simple updates do you follow their naming convention? I just received a project where the previous programmer used Hungarian Notation everywhere. Our core product has a naming standard, but we've had a lot of people do custom reporting over the years and do whatever they felt like.
I do not have time to change all of the variable names already in the code though.
I'm inclined for readablity just to continue with their naming convention.
Yes, I do. It makes it easier to follow by the people who inherit it after you. I do try and clean up the code a little to make it more readable if it's really difficult to understand.
I agree suggest that leaving the code as the author wrote it is fine as long as that code is internally consistent. If the code is difficult to follow because of inconsistency, you have a responsibility to the future maintainer (probably you) to make it clearer.
If you spend 40 hours figuring out what a function does because it uses poorly named variables, etc., you should refactor/rename for clarity/add commentary/do whatever is appropriate for the situation.
That said, if the only issue is that the mostly consistent style that the author used is different from the company standard or what you're used to, I think you're wasting your time renaming everything. Also, you may loose a source of expertise if the original author is still available for questions because he won't recognize the code anymore.
If you're not changing all the existing code to your standard, then I'd say stick with the original conventions as long as you're changing those files. Mixing two styles of code in the same file is destroying any benefit that a consistent code style would have, and the next guy would have to constantly ask himself "who wrote this function, what's it going to be called - FooBar() or fooBar()?"
This kind of thing gets even trickier when you're importing 3rd party libraries - you don't want to rewrite them, but their code style might not match yours. So in the end, you'll end up with several different naming conventions, and it's best to draw clear lines between "our code" and "their code".
Often, making a wholesale change to a codebase just to conform with the style guide is just a way to introduce new bugs with little added value.
This means that either you should:
Update the code you're working on to conform to the guideline as you work on it.
Use the conventions in the code to aide future maintenance efforts.
I'd recommend 2., but Hungarian Notation makes my eyes bleed :p.
If you are maintaining code that others wrote and that other people are going to maintain after you, you owe it to everybody involved not to make gratuitous changes. When they go into the source code control system to see what you changed, they should see what was necessary to fix the problem you were working on, and not a million diffs because you did a bunch of global searches and replaces or reformatted the code to fit your favourite brace matching convention.
Of course, if the original code really sucks, all bets are off.
Generally, yes, I'd go for convention and readability over standards in this scenario. No one likes that answer, but it's the right thing to do to keep the code maintainable long-term.
When a good programmer's reading code, he should be able to parse the variable names and keep track of several in his head -- as long as their consistent, at least within the source file. But if you break that consistency, it will likely force the programmer reading the code to suffer some cognitive dissonance, which would then make it a bit harder to keep track of. It's not a killer -- good programmers will get through it, but they'll curse your name and probably post you on TheDailyWTF.
I certainly would continue to use the same naming convention, as it'll keep the code consistent (even if it is consistently ugly) and more readable than mixing variable naming conventions. Human brains seem to be rather good at pattern recognition and you don't really want to throw the brain a curveball by gratuitously breaking said pattern.
That said, I'm anything but a few of Hungarian Notation but if that's what you've got to work with...
If the file or project is already written using a consistent style then you should try to follow that style, even if it conflicts/contradicts your existing style. One of the main goals of a code style is consistency, so if you introduce a different style in to code that is already consistent (within itself) you loose that consistency.
If the code is poorly written and requires some level of cleanup in order to understand it then cleaning up the style becomes a more relevant option, but you should only do so if absolutely necessary (especially if there are no unit tests) as you run the possiblity of introducing unexpected breaking changes.
Absolutely, yes. The one case where I don't believe it's preferable to follow the original programmer's naming convention is when the original programmer (or subsequent devs who've modified the code since then) failed to follow any consistent naming convention.
Yes. I actually wrote this up in a standards doc. I created at my current company:
Existing code supersedes all other standards and practices (whether they are industry-wide standards or those found in this document). In most cases, you should chameleon your code to match the existing code in the same files, for two reasons:
To avoid having multiple, distinct styles/patterns within a single module/file (which contradict the purpose of standards and hamper maintainability).
The effort of refactoring existing code is prone to being unnecessarily more costly (time-consuming and conducive to introduction of new bugs).
Personally whenever I take over a project that has a different variable naming scheme I tend to keep the same scheme that was being used by the previous programmer. The only thing I do different is for any new variables I add, I put an underscore before the variable name. This way I can quickly see my variables and my code without having to go into the source history and comparing versions. But when it comes to me inheriting simply unreadable code or comments I will usually go through them and clean them up as best I can without re-writing the whole thing (It has come to that). Organization is key to having extensible code!
if I can read the code, I (try) to take the same conventions
if it's not readable anyway I need to refactor and thus changing it (depending on what its like) considerable
Depends. If I'm building a new app and stealing the code from a legacy app with crap variable naming, I'll refactor once I get it into my app.
Yes..
There is litte that's more frustrating then walking into an application that has two drasticly different styles. One project I reciently worked on had two different ways of manipulating files, two different ways to implement screens, two different fundimental structures. The second coder even went so far as to make the new features part of a dll that gets called from the main code. Maintence was nightmarish and I had to learn both paradigms and hope when I was in one section I was working with the right one.
When in Rome do as the Romans do.
(Except for index variables names, e.g. "iArrayIndex++". Stop condoning that idiocy.)
I think of making a bug fix as a surgical procedure. Get in, disturb as little as possible, fix it, get out, leave as little trace of your being there as possible.
I do, but unfortunately, there where several developers before me that did not live to this rule, so I have several naming conventions to choose from.
But sometimes we get the time to set things straight so in the end, it will be nice and clean.
If the code already has a consistent style, including naming, I try to follow it. If previous programmers were not consistent, then I feel free to apply the company standard, or my personal standards if there is not any company standard.
In either case I try to mark the changes I have made by framing them with comments. I know with todays CVS systems this is often not done, but I still prefer to do it.
Unfortunately, most of the time the answer is yes. Most of the time, the code does not follow good conventions so it's hard to follow the precedent. But for readability, it's sometimes necessary to go with the flow.
However, if it's a small enough of an application that I can refactor a lot of the existing code to "smell" better, then I'll do so. Or, if this is part of a larger re-write, I'll also begin coding with the current coding standards. But this is not usually the case.
If there's a standard in the existing app, I think it's best to follow it. If there is no standard (tabs and spaces mixed, braces everywhere... oh the horror), then I do what I feel is best and generally run the existing code through a formatting tool (like Vim). I'll always keep the capitalization style, etc of the existing code if there is a coherent style.
My one exception to this rule is that I will not use hungarian notation unless someone has a gun to my head. I won't take the time to rename existing stuff, but anything I add new isn't going to have any hungarian warts on it.