Is it a good practice to follow optimization techniques during initial coding itself or should one concentrate purely on realization of functionality first?
If one concentrates purely on functionality during initial coding, then how easy or difficult is it to take care of optimization later on?
Optimise your design and architecture - don't lock yourself into a design which will never scale - but don't micro-optimise your implementation. In particular, don't sacrifice simplicity and readability for micro-optimised implementation... at least not without benchmarking your code (ideally your whole system) first.
Measurement really is the key point when it comes to performance. Bottlenecks are almost never where you expect them to be. There are loads of different ways of measuring; optimisation without any measurement is futile IMO.
Donald Knuth said:
We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil
It depends what you see as "optimization". Micro-optimization should not be done in early stages, and afterwards only if you have a valid reason to do so (e.g. profiler results or similar).
However, writing well-structured, clean code following best practices and common coding guidelines is a good habit, and once you're used to it, it doesn't take much more time than writing sloppy code. This kind of "optimization" (not the correct word for it, but some see it as such) should be done from the beginning.
See http://en.wikipedia.org/wiki/Program_optimization for quotes by Knuth.
If you believe that optimization might make your code harder to (a) get right in the first place, or (b) maintain in the long run, then it's probably best to get it right first. Having good development processes, such as Test Driven Development, can help you make optimisations later.
It's always better to have it work right and slow, than wrong and fast.
Rightly said by Donald Knuth "Premature optimization is the the root of all evil " , and it makes your coding speed slow. The best way to optimize is by visiting the codebase again and refactoring. This way you know which part of the code is often used or is a bottleneck and should be fine tuned.
Premature optimization is not a good thing.
And that goes especially for low level optimization. But at a higher level your design shouldn't lock out any future optimization.
For example.
The retrieval of collections should be hidden behind methods call, in the end you can always decide to cache the retrieval of collections or not.
After you have a stable application and(!) you have developed regression unit tests. You can profile the application and optimize the hotspots. And remember to after every optimalization step you should run your complete unit test set.
Is it a good practice to follow optimization techniques during initial coding itself or should one concentrate purely on realization of functionality first?
If you know performance is critical (or important), consider it in your design and write it correctly the first time. If you don't also consider this in your design and it is important, you are wasting time or "developing a proof of concept".
Part of this comes down to experience; If you know optimizations and your program's problem areas or have already implemented similar functionalities in the past, your experience will certainly help you create an implementation closer to the end result the first time. If you still need a proof of concept, you should not be writing the actual program until that's completed -- kick out some tests to determine what solution is appropriate for the problem, then implement it properly.
If one concentrates purely on functionality during initial coding, then how easy or difficult is it to take care of optimization later on?
Some fixes are quick, others deserve complete rewrites. The more that needs to change and adapt after the fact, the more time you waste re-testing and maintaining a poorly implemented program. The libraries that are easiest to maintain and sustain the demands are typically the ones which the engineer had an understanding of what design is ideal, and strived to meet that ideal during initial implementation.
Of course, that also assumes you favor a long-lived program!
Premature optimization is the the root of all evil
To elaborate more on this famous quote, doing optimization early has the disadvantage of distracting you from doing a good design. Also, programmers are notoriously bad in finding which parts of the code cause the more trouble, and so try hard to optimize things that aren't that important. You should always measure first to find out what needs to be optimized and this can only happen in later phases.
Related
In advance apologize if the question seems somewhat broad or strange, I don't mean to offend anyone, but maybe someone can actually make a recommendation. I tried looking for the similar questions, but cold not.
Which are the better resources (books, blogs etc.) that can teach about optimizing code?
There is quite a few resources on making code more human-readable (Code Complete being number one choice probably). But what about making it run faster, more memory-efficient?
Of course there are lots of books on each particular language, but I wonder if there are some that cover the problems of memory / speed of operations and are somewhat language-independent?
Here are some links that might be helpful in general on the subject of memory optimizations
What Every Programmer Should Know About Memory by Ulrich Drepper
Herb Sutter: The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software
Slides: Herb Sutter: Machine Architecture (Things Your Programming Language Never Told You)
Video: Herb Sutter # NWCPP: Machine Architecture: Things Your Programming Language Never Told You
The microarchitecture of Intel, AMD and VIA CPUs
An optimization guide for assembly programmers and compiler makers, by Agner Fog
Read Structured Programming with go to Statements. While it's the source of the "premature optimisation is the source of all evil" quote that comes up the moment somebody wants to make anything faster or smaller - no matter how desperately important or late in the process they are - it's actually about the importance of making things efficient when you can.
Learn about time complexity, space complexity and the analysis of algorithms.
Come up with examples where you would want to sacrifice having worse space complexity for better time complexity, and vice versa.
Know the time and space complexities of the algorithms and data structures your languages and frameworks of choice offer, especially those you use most often.
Read the answers on this site on questions about creating a good hash code.
Study the approach HTTP took to having the advantage of caching, without the disadvantage of using stale data inappropriately. Consider how easy or difficult that is to apply to in-memory caches. Consider when you would say "screw it, I can live with being stale for the speed boost it gives me". Consider when you would say "screw it, I can live with being slow for the guarantee of freshness it gives me".
Learn how to multithread. Learn when it improves performance. Learn why it often doesn't or even makes things worse.
Look at a lot of Joe Duffy's blog where performance is a regular concern of his writing.
Learn how to process items as streams or iterations rather than building and rebuilding data-structures full of each item, each time. Learn when you're actually better off not doing that.
Know what things cost. You can't reasonably decide "I'll work so this is in the CPU cache rather than main-memory/main-memory rather than disk/disk rather than over a network" unless you've a good idea what actually causes each to be hit, and what the cost differences are. Worse, you can't dismiss something as premature optimisation if you don't know what they cost - not bothering to optimise something is often the best choice, but if you don't even consider it in passing you aren't "avoiding premature optimisation", you're muddling through and hoping it works.
Learn a bit about what optimisations are done for you by the script engine/jitter/compiler/etc you use. Learn how to work with them rather than against them. Learn not to re-do work it'll do for you anyway. In one or two cases, you may also be able to apply the same general principle to your work.
Search for cases on this site where something is dismissed as an implementation detail - yes, all of those are cases where the detail in question isn't the most important thing at the time, but all of those implementation details were chosen for a reason. Learn what they were. Learn the counter-arguments.
Edit (I'll keep adding a few more to this as I go):
Different books of course differ in the emphasis they put on efficiency concerns, but I remember Stroustrup's The C++ Programming Language as one where there were a good few times where he will explain a choice between a few different options as relating to efficiency, and also on how to not have decisions made for efficiency's sake impact on the usability of the classes "from the outside".
Which brings me to another point. Concentrate on the efficiency of the library code you reuse in different projects. You don't want to ever be thinking "maybe I should hand-roll a new one here to be more efficient", unless it's a very specialised case, you want to be confident that lots of work went into making that heavily used class efficient over a lot of case, and concentrate on identifying hot-spots.
As for specialised cases, some of the more obscure data structures are worth knowing for the cases they serve. For example, a DAWG is a very compact structure for storing strings with a lot of common prefixes and suffixes (which would be most words in most natural languages) where you just want to find those in the list that match a pattern. If you need a "payload" then a tree where each letter has a list of nodes for each subsequent letter (a generalisation of a DAWG but ending in that "payload" rather than the terminal node) has some but not all of the advantages. They also find the result in O(n) time where n is the length of the string sought.
How often will that come up? Not many. It came up for me once (a few times really, but they were variants of the same case), and as such it would not have been worth it for me to learn all there was to know about DAWGs until then. But I knew enough to know it was what I needed to research later, and it saved me gigabytes (really, from way too much for a machine with 16GB RAM to cope with, to less than 1.5GB). Going straight for a hand-rolled DAWG would totally be premature optimisation rather than putting the strings in a hashset, but flicking through the NIST datastructure site meant I could when it came up.
Consider: "Finding a string in a DAWG is O(n)" "Finding a string in a Hashset is O(1)" Both of these statements is true, but the speed of the two tends to be comparable. Why? Because the DAWG is O(n) in terms of the length of the string, and effectively O(1) in terms of the size of the DAWG. The Hashset is O(1) in terms of the size of the hashset, but working out the hash is typically O(n) in terms of the length of the string, and equality checks are also O(n) in terms of that length. Both statements were correct, but they were thinking about a different n! You always need to know what n means in any discussion of time and space complexity - most often it'll be the size of the structure, but not always.
Don't forget constant effects: O(n²) is the same as O(1) for sufficiently low values of n! Remember that the likes of O(n²) translates as n²*k + n * k₁ + k₂, with the assumption that k₁ & k₂ are low enough and k and the k of another algorithm or structure we are comparing of are close enough, that they don't really matter and it's only n² that we care about. This isn't true all the time, and we can sometimes find that k, k₁ or k₂ are high enough that we end up in trouble. It's also not true when n is going to be so small as to make the difference in the constant costs of different approaches matter. Of course normally when n is small we don't have a big efficiency concern, but what if we are doing m operations on structures averaging n in size, and m is large. If we are choosing between an O(1) and a O(n²) approach, we are choosing between an O(m) and O(n²m) approach overall. It still seems like a no-brainer in favour of the former, but with a low n it essentially becomes a choice between two different O(m) approaches, and the constant factors are much more important.
Learn about lock-free multi-threading. Or perhaps don't. Personally, I've two pieces of my own code I use professionally that use all but the simplest lock-free techniques. One is based on well-known approaches and I wouldn't bother now (it's .NET code first written for .NET2.0 and the .NET4.0 library supplies a class that does the same thing). The other I first wrote for fun, and only used after that just-for-fun period had given me something reliable (and it still gets beaten by something in the 4.0 library for a lot of cases, but not for some others that I care about). I would hate to have to write something like it with a deadline and a client in mind.
All that said, if you're coding out of interest, the challenges involved are interesting and it's an enjoyable thing to work with when you've the freedom to give up on a failed plan that you don't get when you're doing something for a paying client, and you'll certainly learn a lot about efficiency concerns generally. (Take a look at https://github.com/hackcraft/Ariadne if you want to see some of what I've done with this).
A Case Study
Actually, that contains a relatively good example of some of the above principles. Take a look at the method that's currently at line 511 at https://github.com/hackcraft/Ariadne/blob/master/Collections/ThreadSafeDictionary.cs (where I joke in the comments about it being flame-bait for people quoting Dijkstra. Let's use it as a case-study:
This method was first written to use recursion, because it's a naturally recursive problem - after doing the operation on the current table, if there's a "next" table we want to do the exact same operation on that, and so on until there's no further table.
Recursion is almost always slower than iteration, for a few different methods. Should we make all recursive calls iterative? No, it's often not worth it, and recursion is a wonderful way to write code that is clear about what it's doing. Here though I apply the principle above that since this is a library that might be called where performance is crucial, particular effort should be extended on it.
The decision to try to improve its speed being made, the next thing I did was make measurements. I don't depend on "I know that iteration is faster than recursion, so it must be faster when changed to avoid recursion". That's just not true - a poorly written iterative version may not be as good as a well-written recursive version.
The next question is, just how to re-write it. I've a tested method that I know works and I'm going to replace it with a different version. I don't want to replace it with a version that doesn't work, obviously, so how to re-write while taking the most advantage out of what's already there?
Well, I know about tail-call elimination; an optimisation normally done by compilers that changes the way the stack is managed so that recursive functions end up with properties closer to those of iterative (it's still recursive from the perspective of the source code, but it's iterative in terms of how the compiled code actually uses the stack).
This gives me two things to think about: 1. Maybe the compiler is already doing this, in which case my extra work isn't going to do anything to help. 2. If the compiler isn't already doing this, I can take the same basic approach manually.
That decision made, I replaced all of the points where the method called itself, with a change to the one parameter that would be different for that next call, and then go back to the beginning. I.e. instead of having:
CurrentMethod(param0.next, param1, param2, /*...*/);
We have:
param0 = param0.next;
goto startOfMethod;
That being done, I measure again. Running through the entire unit tests for the class is now consistently 13% faster than before. If it were closer I'd have tried more detail measurements, but a consistent 13% on runs that includes code that doesn't even call this method is something I'm pretty happy with. (It also tells me that the compiler wasn't doing the same optimisation, or I wouldn't have gained anything).
Then I clean up the method to make more changes that make sense with the new code. Most of them let me take out the goto because goto is indeed nasty (and there's other places the same optimisation was done that aren't as obvious because the goto was refactored entirely). In some, I left it in, because 13% is worth breaking the no-goto rule to my mind!
So the above gives an example of:
Deciding where to concentrate optimisation effort (based on how often it might be hit and my inability to predict all uses of the library)
Using knowledge of general costs (recursion costs more than iteration, most of the time).
Measuring rather than depending on assuming the above always applies.
Learning from what compilers do.
Understanding that because of that I may not gain anything - maybe the compiler already did it for me.
Avoiding optimisations leading to unreadable code (refactoring out most of the gotos the first pass introduced).
Some of these are matters of opinion and style (the decision to leave in some goto would not be without controversy), and it's certainly okay to disagree with my decisions, but knowledge of the points raised so far in this post would make it an informed disagreement, rather than a knee-jerk one.
In addition to the resources mentioned in other answers, Michael Abrash's Graphics Programming Black Book is a great read for learning about optimization. While the specifics are a bit dated in places, it is still a great resource for learning about how to approach optimization.
Any time you want to optimize code it is absolutely essential to measure, measure, measure. One of the best ways to learn about optimization is by doing - take some code you want to optimize, learn how to use a profiler to measure its performance and then make changes and measure the results.
I am aware that this could be seen as subjective but this is definitely not my intention. I am always on the hunt for techniques that I may have never heard of that help improve both productivity and quality of software engineers.
In particular I am looking for tools, techniques, approaches, tips and tricks, best practices, etc. that helped you to improve both your productivity and quality as a software engineer. This is actually a process related question. So please don't answer with your opinion about which programming language is the best from your perspective.
I expect the answers to be subjective. But that is the beauty of it. Not everything works for everybody. We all have a different set of constraints that we operate under. Therefore it is unavoidable that we make different choices. If the answers are contradictory, that would be perfectly fine!
What were the techniques that were helpful to you specifically? How did they make a difference? What criteria do you use to come to that conclusion?
Ok, here for my subjective answer.
Standard approach for improving quality is unit testing, in my opinion. Of course you can still write crappy code that works and have unit tests that confirm that it works but at least you know that it works. Where unit tests really give you an advantage is when you want to make changes to your code or add additional features. Having unit tests guarantees that your code keeps working.
As for productivity and unit tests it depends whether you look at short-term or long-term productivity. Unit testing takes time so you are less productive writing actual functionality. In the long term I'm absolutely sure you are more productive since your unit tests guarantee that during maintenance all functionality keeps working.
Second productivity and quality enhancing tip is to think each new feature thoroughly through. Once a new feature is shipped, you have to maintain it. Maintenance takes time and decreases productivity. Is the new feature necessary? How many customers actually want this feature? Always try to look at the bigger picture, what is your own vision for your product, does the new feature fit in with this vision.
The less code you have, the less code you have to maintain and the less bugs you have. I always try to keep that in mind.
Delegate to the appropriate party. Then review the assigned task periodically.
Once you know who best to delegate to, who will get tasks done correctly and efficiently, it improves your productivity massively.
It lets you focus on what you want or need to be doing.
Like answering stackoverflow questions for example.
Learning how to use recursion effectively. It's given me a framework to decompose complex problems into understandable code. It's helped me to code tough bits faster with no or very few errors. The book that taught me how to think this way was The Little Schemer by DP Friedman.
The second I'd say was learning Lisp. It's helped me to learn other languages more quickly. I can now classify other languages by the subset of Lisp functionality that they implement.
P.S. I don't use recursion frequently in my software. Most modern languages and frameworks have features and utility functions that allow you to avoid using recursion for simpler problems.
Actually I am making a major project in implementing compiler optimization techniques. I already know about the existing techniques, but I am confused what technique to choose and how to implement it.
G'day,
What area of optimization are you talking about?
Compiler optimizations such as:
loop optimizations
dataflow optimizations
static single assignment based optimizations
code generator optimizations
etc.
etc.
Or optimization in the performance of the compiler itself, i.e. the speed with which it works?
Assuming that you have a compiler to optimize, and if it wasn't written by you, look up the documentation to see what is missing. Otherwise, if it was written by you, you can start off with the simplest. The definition for the simplest will depend on the language your compiler consumes. Or am I missing something?
I think you may have over optimized your question . Are you trying to decide where to start or trying to decide if some optimizations are worth implementing and others are not? I would assume all of the existing techniques have a place and are useful depending on the code they come across. If you are deciding which one to do first, pick the one you can do and do it. Pick the low hanging fruit. Get a few wins in your back pocket before you tackle a tough one and stumble and get frustrated. I would assume the real trick is having all the optimizations there and working but coming up with a way to decide which ones produce something better for a particular program and which ones get in the way and make things worse.
IMHO, the thing to do is implement the simple, obvious optimizations and then let it rest. Certainly it is very interesting to try to do weird and wonderful optimizations to rectify things that the user could simply have coded a little better, but if you really want to try to clean up after poor coding or poor design, the user can always outrun you. This is my favorite example.
My favorite example of compiler-optimizations-gone-nuts is Fortran compilers, where they go to such lengths to scramble code to shave a few hypothetical cycles that the code is almost impossible to debug, and typically the program counter is in there less than 1% of the time, so the effort is wasted.
I read the latest coding horror post, and one of the comments touched a nerve for me:
This is the type of situation that test driven design/refactoring are supposed to fix. If (big if) you have tests for the interfaces, rewriting the implementation is risk-free, because you will know whether you caught everything.
Now in theory I like the idea of test driven development, but all the times I've tried to make it work, it hasn't gone particularly well, I get out of the habit, and next thing I know all the tests that I had originally written not only don't pass, but they're no longer a reflection of the design of the system.
It's all well and good if you've been handed a perfect design from on high, straight from the start (which in my experience never actually happens), but what if halfway through the production of a system you notice that there's a critical flaw in the design? Then it's no longer a simple matter of diving in and fixing "the bug", but you also have to rewrite all the tests. A fundamental assumption was wrong, and now you have to change it. Now test driven development is no longer a handy thing, but it just means that there's twice as much work to do everything.
I've tried to ask this question before, both of peers, and online, but I've never heard a very satisfactory answer. ... Oh wait.. what was the question?
How do you combine test driven development with a design that has to change to reflect a growing understanding of the problem space? How do you make the TDD practice work for you instead of against you?
Update:
I still don't think I fully understand it all, so I can't really make a decision about which answer to accept. Most of my leaps in understanding have happened in the comments sections, not in the answers. Here' s a collection of my favorites so far:
"Anyone who uses terms like "risk-free"
in software development is indeed full
of shit. But don't write off TDD just
because some of its proponents are
hyper-susceptible to hype. I find it
helps me clarify my thinking before
writing a chunk of code, helps me to
reproduce bugs and fix them, and makes
me more confident about refactoring
things when they start to look ugly"
-Kristopher Johnson
"In that case, you rewrite the tests
for just the portions of the interface
that have changed, and consider
yourself lucky to have good test
coverage elsewhere that will tell you
what other objects depend on it."
-rcoder
"In TDD, the reason to write the tests
is to do design. The reason to make
the tests automated is so that you can
reuse them as the design and code
evolve. When a test breaks, it means
you've somehow violated an earlier
design decision. Maybe that's a
decision you want to change, but it's
good to get that feedback as soon as
possible."
-Kristopher Johnson
[about testing interfaces] "A test would insert some elements,
check that the size corresponds to the
number of elements inserted, check
that contains() returns true for them
but not for things that weren't
inserted, checks that remove() works,
etc. All of these tests would be
identical for all implementations, and
of course you would run the same code
for each implementation and not copy
it. So when the interface changes,
you'd only have to adjust the test
code once, not once for each
implementation."
–Michael Borgwardt
One of the practices of TDD is the use of Baby Steps (which could be very boring in the beggining) which is the use of really small steps in order for you to understand your problem space and make a good and satisfactory solution for your problem.
If you already know the design of your application you aren't doing TDD at all. We should design it while doing your tests.
So the suggestion I would give is for you to concentrate on the baby steps in order to get a proper testable design
I don't think any real practitioner of TDD will claim that it completely eliminates the possibility of error or regression.
Remember that TDD is fundamentally about design, not about testing or quality control. Saying "all my tests pass" does not mean "I'm finished."
If your requirements or high-level design change drastically, then you may need to throw away all your tests along with all the code. That's just how things are sometimes. It doesn't mean that TDD isn't helping you.
Properly applied, TDD should actually make your life a lot easier in the face of changing requirements.
In my experience, code that is easy to test is code that is orthogonal from other subsystems, and which has clearly defined interfaces. Given such a starting point, it is much easier to rewrite significant portions of your application, since you can work with confidence knowing that a) your changes will be isolated to a few subsystems, and b) any breakage will quickly show up as failing tests.
If, on the other hand, you're just slapping unit tests on your code after it has been designed, then you may well have problems when requirements change. There's a difference between tests that fail quickly when subsystems change (because they're effectively flagging regressions) and those that are brittle, because they depend on too many unrelated pieces of system state. The former should be fixable by a few lines of code, while the latter may leave you scratching your head for hours trying to unravel them.
The only true answer is it depends.
There are ways to do TDD wrong, such
that it doesn't fit in with your
environment and eats effort with
minimal benefit.
There are ways to do TDD right, such
that it both cuts costs and increases
quality.
There are ways to something
similar-but-different to TDD, which
may or may not get called TDD, and
may or may not be more appropriate in
your particular situation.
It's a strange quirk of the market for software tools and experts that, to maximise the revenue for those pushing them, they are always written as if they somehow apply to 'all software'.
Truth is, 'software' is every bit as diverse as 'hardware', and nobody would think of buying a book on bridge-making to design an electronic gadget or build a garden shed.
I think you have some misconceptions about TDD. For a good explanation and example of what it is and how to use it, I recommend reading Kent Beck's Test-Driven Development: By Example.
Here are a few further comments that may help you understand what TDD is and why some people swear by it:
"How do you combine test driven development with a design that has to change to reflect a growing understanding of the problem space?"
TDD is a technique for exploring a problem space and creating and evolving a design that meets your needs. TDD is not something you do in addition to doing design; it is doing design.
"How do you make the TDD practice work for you instead of against you?"
TDD is not "twice as much work" as not doing TDD. Yes, you'll write a lot of tests, but that doesn't really take much time, and the effort isn't wasted. You have to test your code somehow, right? Running automated tests are a lot quicker than manually testing whenever you change something.
A lot of TDD tutorials present highly detailed tests of every method of every class. In real life, people don't do this. It is silly to write a test for every setter, every getter, and so on. The Beck book does a good job of showing how to use TDD to quickly design and implement something, slowing down to "baby steps" only when things get tricky. See How Deep Are Your Unit Tests for more on this point.
TDD is not about regression testing. TDD is about thinking before you write code. But having regression tests is a nice side benefit. They don't guarantee that code will never break, but they help a lot.
When you make changes that cause tests to break, that's not a bad thing; it's valuable feedback. Designs do change, and your tests aren't written in stone. If your design has changed so much that some tests are no longer valid, then just throw them away. Write the new tests you need to be confident about the new design.
it's no longer a simple matter of
diving in and fixing "the bug", but
you also have to rewrite all the
tests.
A fundamental creed of TDD is to avoid duplication both in the production code AND in the test code. If a single design change means you have to rewrite everything, you weren't doing TDD (or not doing it correctly at all).
Ideally, in a well-designed system with proper separation of concerns, design changes are local, just like implementation changes. While the real world is rarely ideal, you still usually get something in between: you have to change some of the production code and some of the tests, but not everything, and the changes are mostly simple and may even be done automatically by refactoring tools.
Coding something without knowing what will work best in the UI, while at the same time writing unittests. That is very time consuming. It's better to start out making some prototypes of the GUI to get the interaction right.. and then rewrite it with unittests (if you employer allows you).
Continuous Integration (CI) is one key. If your tests run automatically every time you check in to source control (and everyone else sees it if they fail), it's easier to avoid "stale" tests and stay in the green.
As Mr. Dias mentioned, Baby Steps are important. You make a small refactoring, you run your tests. If tests break, you immediately determine if this is expected (design change) or a failed refactoring. When tests are truly independent (comes with practice), this is seldom very difficult. Evolve your design slowly.
See also http://thought-tracker.blogspot.com/2005/11/notes-on-pragmatic-unit-testing.html - and definitely buy the book!
EDIT: Perhaps I'm looking at this the wrong way. Say you had a legacy codebase that you wanted to redesign. The first thing I would try to do is add tests for the current behavior. Refactoring without tests is risky - you might change behavior. After that, I would start to clean up the design, in small steps, running my unit tests after each step. That would give me confidence that my changes weren't breaking anything.
At some point the API might change. This would be a breaking change - clients would have to be updated. The tests would tell me this - which is good, because I'd have to update any existing clients (including the tests).
Now that's not TDD. But the idea is the same - the tests are specifications of behavior (yes, I'm shading into BDD), and they give me the confidence to refactor the implementation while insuring that I preserve the behavior (as well as letting me know when I change the interface).
In practice, I've found TDD gives me immediate feedback on poor interface design. I'm my first client - I know when my API is hard to use.
We tend to do much less design up front with TDD, knowing it can change. I have taken projects through huge gyrations (it's a web app, no it's a RESTful server, no it's a bot). The tests provide me with the ability to refactor and restructure and evolve your code much more easily than untested code. Although it seems contradictory, it is true-- even though you have more code, you are able to make major changes and have confidence that nothing has broken in the existing functionality.
I understand your concern that fundamental assumptions changing make you throw out tests. This seems intuitive, but I personally haven't seen it. Some tests go, but most are still valid-- often a major change isn't as major as it seems at first. Plus, as you get better at writing tests, you tend to write less brittle ones, which helps.
I tend to do a lot of projects on short deadlines and with lots of code that will never be used again, so there's always pressure/temptation to cut corners. One rule I always stick to is encapsulation/loose coupling, so I have lots of small classes rather than one giant God class. But what else should I never compromise on?
Update - thanks for the great response. Lots of people have suggested unit testing, but I don't think that's really appropriate to the kind of UI coding I do. Usability / User acceptance testing seems much important. To reiterate, I'm talking about the BARE MINIMUM of coding standards for impossible deadline projects.
Not OOP, but a practice that helps in both the short and long run is DRY, Don't Repeat Yourself. Don't use copy/paste inheritance.
Not a OOP practice, but common sense ;-).
If you are in a hurry, and have to write a hack. Always add a piece of comment with the reasons. So you can trace it back and make a good solution later.
If you never had the time to come back, you always have the comment so you know, why the solution was chosen at the moment.
Use Source control.
No matter how long it takes to set up (seconds..), it will always make your life easier! (still it's not OOP related).
Naming. Under pressure you'll write horrible code that you won't have time to document or even comment. Naming variables, methods and classes as explicitly as possible takes almost no additional time and will make the mess readable when you must fix it. From an OOP point of view, using nouns for classes and verbs for methods naturally helps encapsulation and modularity.
Unit tests - helps you sleep at night :-)
This is rather obvious (I hope), but at the very least I always make sure my public interface is as correct as possible. The internals of a class can always be refactored later on.
no public class with mutable public variables (struct-like).
Before you know it, you refer to this public variable all over your code, and the day you decide this field is a computed one and must have some logic in it... the refactoring gets messy.
If that day is before your release date, it gets messier.
Think about the people (may even be your future self) who have to read and understand the code at some point.
Application of the single responsibility principal. Effectively applying this principal generates a lot of positive externalities.
Like everyone else, not as much OOP practices, as much as practices for coding that apply to OOP.
Unit test, unit test, unit test. Defined unit tests have a habit of keeping people on task and not "wandering" aimlessly between objects.
Define and document all hierarchical information (namespaces, packages, folder structures, etc.) prior to writing production code. This helps to flesh out object relations and expose flaws in assumptions related to relationships of objects.
Define and document all applicable interfaces prior to writing production code. If done by a lead or an architect, this practice can additionally help keep more junior-level developers on task.
There are probably countless other "shoulds", but if I had to pick my top three, that would be the list.
Edit in response to comment:
This is precisely why you need to do these things up front. All of these sorts of practices make continued maintenance easier. As you assume more risk in the kickoff of a project, the more likely it is that you will spend more and more time maintaining the code. Granted, there is a larger upfront cost, but building on a solid foundation pays for itself. Is your obstacle lack of time (i.e. having to maintain other applications) or a decision from higher up? I have had to fight both of those fronts to be able to adopt these kinds of practices, and it isn't a pleasant situation to be in.
Of course everything should be Unit tested, well designed, commented, checked into source control and free of bugs. But life is not like that.
My personal ranking is this:
Use source control and actually write commit comments. This way you have a tiny bit of documentation should you ever wonder "what the heck did I think when I wrote this?"
Write clean code or document. Clean well-written code should need little documentation, as it's meaning can be grasped from reading it. Hacks are a lot different. Write why you did it, what you do and what you'd like to do if you had the time/knowledge/motivation/... to do it right
Unit Test. Yes it's down on number three. Not because it's unimportant but because it's useless if you don't have the other two at least halfway complete. Writing Unit tests is another level of documentation what you code should be doing (among others).
Refactor before you add something. This might sound like a typical "but we don't have time for it" point. But as with many of those points it usually saves more time than it costs. At least if you have at least some experience with it.
I'm aware that much of this has already been mentioned, but since it's a rather subjective matter, I wanted to add my ranking.
[insert boilerplate not-OOP specific caveat here]
Separation of concerns, unit tests, and that feeling that if something is too complex it's probably not conceptualised quite right yet.
UML sketching: this has clarified and saved any amount of wasted effort so many times. Pictures are great aren't they? :)
Really thinking about is-a's and has-a's. Getting this right first time is so important.
No matter how fast a company wants it, I pretty much always try to write code to the best of my ability.
I don't find it takes any longer and usually saves a lot of time, even in the short-term.
I've can't remember ever writing code and never looking at it again, I always make a few passes over it to test and debug it, and even in those few passes practices like refactoring to keep my code DRY, documentation (to some degree), separation of concerns and cohesion all seem to save time.
This includes crating many more small classes than most people (One concern per class, please) and often extracting initialization data into external files (or arrays) and writing little parsers for that data... Sometimes even writing little GUIs instead of editing data by hand.
Coding itself is pretty quick and easy, debugging crap someone wrote when they were "Under pressure" is what takes all the time!
At almost a year into my current project I finally set up an automated build that pushes any new commits to the test server, and man, I wish I had done that on day one. The biggest mistake I made early-on was going dark. With every feature, enhancement, bug-fix etc, I had a bad case of the "just one mores" before I would let anyone see the product, and it literally spiraled into a six month cycle. If every reasonable change had been automatically pushed out it would have been harder for me to hide, and I would have been more on-track with regard to the stakeholders' involvement.
Go back to code you wrote a few days/weeks ago and spend 20 minutes reviewing your own code. With the passage of time, you will be able to determine whether your "off-the-cuff" code is organized well enough for future maintenance efforts. While you're in there, look for refactoring and renaming opportunities.
I sometimes find that the name I chose for a function at the outset doesn't perfectly fit the function in its final form. With refactoring tools, you can easily change the name early before it goes into widespread use.
Just like everybody else has suggested these recommendations aren't specific to OOP:
Ensure that you comment your code and use sensibly named variables. If you ever have to look back upon the quick and dirty code you've written, you should be able to understand it easily. A general rule that I follow is; if you deleted all of the code and only had the comments left, you should still be able to understand the program flow.
Hacks usually tend to be convoluted and un-intuitive, so some good commenting is essential.
I'd also recommend that if you usually have to work to tight deadlines, get yourself a code library built up based upon your most common tasks. This will allow you to "join the dots" rather than reinvent the wheel each time you have a project.
Regards,
Docta
An actual OOP practice I always make time for is the Single Responsibility Principle, because it becomes so much harder to properly refactor the code later on when the project is "live".
By sticking to this principle I find that the code I write is easily re-used, replaced or rewritten if it fails to match the functional or non-functional requirements. When you end up with classes that have multiple responsibilities, some of them may fulfill the requirements, some may not, and the whole may be entirely unclear.
These kinds of classes are stressful to maintain because you are never sure what your "fix" will break.
For this special case (short deadlines and with lots of code that will never be used again) I suggest you to pay attention to embedding some script engine into your OOP code.
Learn to "refactor as-you-go". Mainly from an "extract method" standpoint. When you start to write a block of sequential code, take a few seconds to decide if this block could stand-alone as a reusable method and, if so, make that method immediately. I recommend it even for throw-away projects (especially if you can go back later and compile such methods into your personal toolbox API). It doesn't take long before you do it almost without thinking.
Hopefully you do this already and I'm preaching to the choir.