How to gauge the quality of a software product - testing

I have a product, X, which we deliver to a client, C, every month (including bugfixes, enhancements, new development, etc.). Each month, I am asked to, err, "guarantee" the quality of the product.
For this we use a number of statistics garnered from the tests that we do, such as:
reopen rate (number of bugs reopened/number of corrected bugs tested)
new bug rate (number of new, including regressions, bugs found during testing/number of corrected bugs tested)
for each new enhancement, the new bug rate (the number of bugs found for this enhancement/number of mandays)
and various other figures.
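To make the definitions above concrete, here is a minimal sketch of how these rates could be computed from raw counts. The class and field names are illustrative assumptions, not taken from any real tracker, and the numbers are made up.

```python
from dataclasses import dataclass


@dataclass
class MonthlyTestStats:
    """Raw counts from one monthly test cycle (field names are illustrative)."""
    corrected_bugs_tested: int
    bugs_reopened: int
    new_bugs_found: int  # includes regressions


def reopen_rate(stats: MonthlyTestStats) -> float:
    """Bugs reopened per corrected bug tested."""
    return stats.bugs_reopened / stats.corrected_bugs_tested


def new_bug_rate(stats: MonthlyTestStats) -> float:
    """New bugs (including regressions) per corrected bug tested."""
    return stats.new_bugs_found / stats.corrected_bugs_tested


def enhancement_bug_rate(bugs_found: int, man_days: float) -> float:
    """Bugs found for one enhancement per man-day spent on it."""
    return bugs_found / man_days


# Made-up numbers for one delivery.
march = MonthlyTestStats(corrected_bugs_tested=40, bugs_reopened=4, new_bugs_found=10)
print(f"reopen rate:  {reopen_rate(march):.1%}")
print(f"new bug rate: {new_bug_rate(march):.1%}")
print(f"enhancement:  {enhancement_bug_rate(3, 12):.2f} bugs per man-day")
```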
It is impossible, for reasons we shan't go into, to test everything every time.
So, my question is:
How do I estimate the number and type of bugs that remain in my software?
What testing strategies do I have to follow to make sure that the product is good?
I know this is a bit of an open question, but hey, I also know that there are no simple solutions.
Thanks.

I don't think you can ever really estimate the number of bugs in your app. Unless you use a language and process that allows formal proofs, you can never really be sure. Your time is probably better spent setting up processes to minimize bugs than trying to estimate how many you have.
One of the most important things you can do is have a good QA team and good work item tracking. You may not be able to do full regression testing every time, but if you have a list of the changes you've made to the app since the last release, then your QA people (or person) can focus their testing on the parts of the app that are expected to be affected.
Another thing that would be helpful is unit tests. The more of your codebase you have covered, the more confident you can be that changes in one area didn't inadvertently affect another area. I've found this quite useful, as sometimes I'll change something and forget that it would affect another part of the app, and the unit tests showed the problem right away. Passing unit tests won't guarantee that you haven't broken anything, but they can help increase confidence that the changes you make are working.
Also, this is a bit redundant and obvious, but make sure you have good bug tracking software. :)

The question is who requires you to provide the stats.
If it's non-technical people, fake the stats. By "fake", I mean "provide any inevitably meaningless, but real numbers" of the kind you mentioned.
If it's technical people without a CS background, they ought to be told about the halting problem, which is undecidable and is simpler than counting and classifying the remaining bugs.
There are a lot of metrics and tools regarding software quality (code coverage, cyclomatic complexity, coding guidelines and tools enforcing them, etc.). In practice, what works is automating as many tests as possible, having human testers do as many of the tests that weren't automated as possible, and then praying.

I think keeping it simple is the best way to go. Categorize your bugs by severity, and address them in order of decreasing severity.
This way you can hand over the highest-quality build possible (the number of significant bugs remaining is how I would gauge the quality of the product, as opposed to some complex statistics).

Most of the agile methodologies address this dilemma pretty clearly. You can't test everything. Neither can you test it an infinite number of times before you release. So the procedure is to rely on the risk and likelihood of the bug. Both risk and likelihood are numerical values. The product of both gives you an RPN (risk priority number). If the number is less than 15 you ship a beta. If you can bring it down to less than 10 you ship the product and push the bug to be fixed in a future release.
How to calculate risk?
If it's a crash then it's a 5.
If it's a crash but you can provide a workaround then it's a number less than 5.
If the bug reduces the functionality then it's a 4.
How to calculate likelihood?
If you can reproduce it every time you run, it's a 5.
If the workaround provided still causes it to crash then it's less than 5.
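As a rough sketch of the scheme just described: the 1 to 5 scale and the idea of applying the thresholds to the worst open bug are my assumptions, since the description above does not say how several bugs are combined. The function names are made up for illustration.

```python
def rpn(risk: int, likelihood: int) -> int:
    """Risk priority number: the product of the two scores (assumed 1-5 each)."""
    for name, score in (("risk", risk), ("likelihood", likelihood)):
        if not 1 <= score <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {score}")
    return risk * likelihood


def release_decision(open_bug_scores: list[tuple[int, int]]) -> str:
    """Apply the 10 and 15 thresholds to the highest RPN among open bugs."""
    worst = max((rpn(r, l) for r, l in open_bug_scores), default=0)
    if worst < 10:
        return "ship"
    if worst < 15:
        return "ship beta"
    return "hold"


# A crash that reproduces every time: 5 * 5 = 25, so the release is held.
print(release_decision([(5, 5), (4, 2)]))
# Reduced functionality seen some of the time: 4 * 3 = 12, beta only.
print(release_decision([(4, 3)]))
```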
Well, I am curious to know whether anyone else is using this scheme, and eager to know their mileage with it.

How long is a piece of string? Ultimately, what makes a quality product? Bugs give some indication, yes, but many other factors are involved; unit test coverage is a key factor, IMO. But in my experience the main factor that affects whether a product can be deemed quality or not is a good understanding of the problem that is being solved. Often what happens is that the 'problem' the product is meant to solve is not understood correctly, and developers end up inventing the solution to a problem they have fleshed out in their heads, and not the real problem; thus 'bugs' are made. I am a strong proponent of iterative Agile development: that way the product is constantly assessed against the 'problem' and does not stray too far from its goal.

The questions I heard were: how do I estimate the bugs in my software, and what techniques do I use to ensure the quality is good?
Rather than go through a full course, here are a couple approaches.
How do I estimate the bugs in my software?
Start with the history: you know how many you found during testing (hopefully) and you know how many were found after the fact. You can use that to estimate how efficient you are at finding bugs (DDR, Defect Detection Rate, is one name for this). If you can show that your DDR is consistent (or improving) over some period of time, you can provide some insight into the quality of the release by estimating the number of defects that will be found once the product is released.
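Here is a minimal sketch of that estimate, assuming DDR is defined as defects found in testing divided by all defects eventually found, and that it is pooled over past releases. The history and the function names are made up for illustration.

```python
def defect_detection_rate(history: list[tuple[int, int]]) -> float:
    """Pooled DDR over past releases.

    Each entry is (found_in_testing, found_after_release).
    """
    found_in_test = sum(t for t, _ in history)
    found_later = sum(f for _, f in history)
    return found_in_test / (found_in_test + found_later)


def estimate_escaped_defects(found_in_test: int, ddr: float) -> float:
    """Expected post-release defects if this release behaves like the history."""
    estimated_total = found_in_test / ddr
    return estimated_total - found_in_test


# Made-up history of three releases: (found in testing, found in the field).
past_releases = [(45, 5), (54, 6), (36, 4)]
ddr = defect_detection_rate(past_releases)
print(f"historical DDR: {ddr:.0%}")
print(f"expected escapes this release: {estimate_escaped_defects(27, ddr):.1f}")
```

The estimate is only as good as the assumption that this release resembles the earlier ones, which is why the consistency of the DDR over time matters.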
What techniques do I use to ensure the quality is good?
Root cause analysis on your bugs will point you to specific components that are buggy, specific developers that create buggy code, the fact that lacking full requirements results in implementation not matching expectations, etc.
Hold project review meetings to quickly identify what was good, so those things can be repeated, and what was bad, so you can find a way not to do those again.
Hopefully, these give you a good start. Good Luck!

It seems the consensus is that the emphasis should be placed on unit testing. Bug tracking is a good indicator of product quality, but it is only as accurate as your test team. If you employ unit testing, it gives you a measurable metric of code coverage and provides regression testing, so you can be assured you didn't break anything since last month.
My company relies on system/integration level testing. I see a lot of defects being introduced because there is a lack of regression testing. I think "bugs" where the developer's implementation of the requirements deviates from the user's vision are sort of a separate problem that, as Dan and rptony stated, is best addressed by Agile methodologies.

Software Metrics in Agile Methodologies [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Agile methodologies are rather prevalent these days, but I cannot seem to find much documentation on what metrics are most useful and why. I have found many more things saying that some traditional metrics like LOC and code coverage of tests are not appropriate, leaving two main questions:
Why are those two (and other) metrics inappropriate?
What metrics are best for Agile and why?
Even with an Agile process, wouldn't you want to know how much code coverage you have with your unit tests? Or is it simply that this metric (and others) just are not as useful as other metrics like cyclomatic complexity and velocity?
Agile is a business-oriented thing: it is about maximizing customer value while minimizing waste to provide the best ROI. This is what should get measured. And to do so, I use the system that Mary Poppendieck recommends. This system is based on three holistic measurements that must be taken as a package:
Cycle time
From product concept to first release or
From feature request to feature deployment or
From bug detection to resolution
Business Case Realization (without this, everything else is irrelevant)
P&L or
ROI or
Goal of investment
Customer Satisfaction
e.g. Net Promoter Score
Sure, at the team level you can track things like test coverage, cyclomatic complexity, conformance to coding standards, etc., but high quality is not an end in itself, it's just a means. Don't misinterpret me: I'm not saying high quality doesn't matter. High quality is mandatory to achieve a sustainable pace (and we include "no increase of the technical debt" in our Definition of Done), but still, the goal is to deliver value to the customer in a fast and profitable way.
Irrespective of methodology, there are some basic metrics that can and should be used.
According to Stephen H. Kan, the most important are the following three:
size of product
number of defects found in final phase of testing
and number of defects found in the field.
If those are all you track, there are at least five ways they can be used:
calculate product defect rate (A)
calculate test defect rate (B)
determine a desirable goal for A and monitor the performance
determine a desirable goal for B and monitor the performance
assess correlation between A and B
if correlation is found, form metric of test effectiveness (B/A * 100%)
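A small sketch of how those three numbers could be turned into A, B and the ratio above. Measuring size in KLOC and using a plain Pearson correlation across releases are my assumptions; the answer only gives the B/A formula. All figures are invented.

```python
from math import sqrt


def defect_rate(defects: int, size_kloc: float) -> float:
    """Defects per thousand lines of code."""
    return defects / size_kloc


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# One release (made-up numbers).
size_kloc = 120.0
a = defect_rate(30, size_kloc)   # A: defects found in the field
b = defect_rate(90, size_kloc)   # B: defects found in the final test phase
print(f"A = {a:.2f}/KLOC, B = {b:.2f}/KLOC, B/A = {b / a * 100:.0f}%")

# Several releases: do A and B move together?
a_series = [0.25, 0.30, 0.20, 0.35]
b_series = [0.75, 0.95, 0.55, 1.00]
print(f"correlation between A and B: {pearson(a_series, b_series):.2f}")
```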
Although not necessarily fun to read, Metrics and Models in Software Quality Engineering provides an excellent in-depth overview of software engineering and metrics.
1.1) LOC is the easy one to answer
It really depends on the language you use! The same feature might differ greatly in size when written in Java or in Ruby, for example.
Poorly written software might have more lines than well-written software!
1.2) Code coverage
IMHO you should use this metric. Although it's not perfect, it should give you a good idea of where your code needs more tests.
One thing to take care of here is that it also depends on the language. There could be situations where you have a class or method that you really don't need to test, for example a class with only getters and setters.
2) In (1) you only mentioned code metrics, but judging from your question about velocity, you are interested in metrics for the whole creation process, so I will list some:
Velocity: The classic one. If used well, it can considerably improve an agile team's performance, since you will know what your team can really do in a fixed time.
Burn up and burn down charts: they can give you a good notion of how the team is performing during the iteration (sprint).
There are some articles on InfoQ about this. Here and here.
As for question 1, I don't see any reason those metrics would be bad in an Agile process.
LOC provides you with a relative size measurement. While it may not always be useful to compare numbers between projects, it can provide you with a rate of growth within the project. If you can get it, the number of lines changed within a sprint may be useful as well, to track the rate of refactoring.
Code coverage (of lines of code) gives you a general sense of whether or not your team is meeting a minimum bar of automated testing within a project.
As for question 2, keep the items above and here are a few more:
LOC versus test count. If you can, maintain separate ratios for unit, integration and system tests.
Average number of acceptance criteria versus test scenarios (or tests) for each story. It can help provide a better sense of whether or not you're testing against the story's intent.
Number of defects discovered
Amount of work discovered (this is often captured by Agile tracking software) that wasn't part of the original estimates. It will help you judge if you are doing 'enough' planning.
Tracking consistencies, or lack thereof, of velocity sprint to sprint
While probably not popular and potentially dangerous, tracking estimates against work completed for each developer. While teams are supposed to be self-organized and driven, not all teams are capable of dealing with human problems.
Just to add
Why LOC and Code Coverage of Tests are less than ideal:
Agile emphasizes outcome, not output (see Agile Manifesto). These two simply track output. Also, they do not properly measure refactoring, which is a vital aspect of Agile processes.
Another metric to consider would be Running Tested Features. I can't describe it any better than this: http://xprogramming.com/articles/jatrtsmetric/
I'm going to answer to this very old question...
LOC and test coverage are, in my opinion, good metrics, but they have one big problem: if you push them, you can make them grow quickly, but the result will be terrifying: tons of nonsense code, or, in the case of test coverage, you can invoke all your code in a try-catch block and not write a single assert... Or even worse, write one just for "compliance" reasons, but without any business-facing or code-facing meaning...
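To illustrate the try-catch trick, here is a small made-up example. The function and check names are hypothetical; `apply_discount` is deliberately buggy so the difference between the two checks shows up.

```python
def apply_discount(price: float, percent: float) -> float:
    """Deliberately buggy: adds the discount instead of subtracting it."""
    return price + price * percent / 100


def coverage_only_check() -> None:
    """Executes every line of apply_discount but verifies nothing."""
    try:
        apply_discount(100.0, 10.0)
    except Exception:
        pass


def asserting_check() -> None:
    """Verifies the result, so the bug above is caught."""
    assert apply_discount(100.0, 10.0) == 90.0


def passes(check) -> bool:
    """Run a check and report whether it passed."""
    try:
        check()
    except AssertionError:
        return False
    return True


print("coverage-only check passes:", passes(coverage_only_check))
print("asserting check passes:    ", passes(asserting_check))
```

Both checks give the same line coverage for `apply_discount`, but only the second one notices that the function is wrong.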
So, these kinds of metrics are very good if they help the team to honestly evaluate their outcome, but are an evil tool if they form part of some "compliance" rules, as using them in that way causes more harm (dead code, bad tests!) than good.
So, with every metric, think how you would trick it if you were forced to achieve a certain value, and think of the consequences... This is not only an issue with LOC or test coverage; many other metrics can have a similar outcome, even cyclomatic complexity... If you divide your code in a bad manner, you can reduce cyclomatic complexity, but it doesn't mean you get better or more readable code!
So, these kinds of metrics are quite good for seeing what's happening inside a team, but any measure you take should be based on concrete goals, not on the metric itself... For example:
Test coverage is low: you hold coding dojos once a month to help train people to write testable code, you find out what code has the worst test coverage and try to implement a better / more testable architecture that helps / motivates developers to write tests, etc.
As you can see, you never tell the team to achieve a certain value of test coverage, you just use the metric to see where you can improve and then look for measures that benefit your process, after a time you would expect test coverage to increase, but you are not pushing people to do so! You are evaluating changes in order to see if the measures are helping. If after a time you find out that test coverage has not changed with your measures, then it's time to look for other ideas, and so on...

Encouraging management to scrap manual tests and do things the proper way

I am working on a project which is quite complex in terms of size (it's to make a web app). The first problem is that nobody is interested in any products which could really solve the problems surrounding the project (lack of time, no adjustments in timescales in response to ever changing requirements). Bear in mind these products are not expensive ( < $500 for a company making millions) and not products which require a lot of configuration (though the project needs products like that, such as build automation tools, to free up time).
Anyway, this means that testing is all done manually as documentation is a deliverable - this means the actual technical design, implementation and testing of the site suffers (are we developers or document writers? What are we trying to do here? are questions which come to mind). The site is quite large and complex (not on the scale of Facebook or anything like that), but doing manual tests as instructed to do so (despite my warnings) tells me this is not high quality testing and therefore not a high quality product to come out of it.
What benefits can I suggest to the relevant people to encourage automated testing (which they know I can implement)? I know it is possible to change resolution via cmd with a 3rd party app for Windows, so this could all be part of an automated build. Instead, I will probably have to run through all these permutations of browsers, screen resolutions, and window sizes manually. Also, where do recorded tests fall down? Do they break when windows are minimised? The big problem with this is that I am doing the work of monitoring the test and the PC is not doing ALL of the work, which is my job (make the PC do all the work). And given a lack of resources, this clogs up a dev box - yes, used for development and then by me for testing. Much better to automate this for a night run when the box is free.
Thanks
Talking about money is usually the best way to get management attention, so here are a few suggestions:
Estimate how long it takes you to do your current manual testing.
Get a list of critical bugs that were found by customers - ideally with an idea of the impact cost (fixing a bug after release is always much more expensive than before), but it's usually good enough just to describe one or two particularly bad bugs. Your manual testing didn't catch these customer bugs, so this is a good way to demonstrate that your manual testing is inadequate.
Come up with a pilot project where you automate testing a certain area of the product where bugs were found in production. Estimate the cost of the pilot project - doing a restricted pilot has the advantage of being easier to scope and estimate. Then compare the ongoing cost of repeatedly running the automation versus testing every release manually; after a few releases you should break even on the cost of the automation tool plus the test development. Be careful picking the automation area - try to avoid areas like a complex UI that might change significantly between releases and thus require a lot of time to be spent on updating the automated tests.
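The break-even comparison in the last point can be sketched like this. All figures and names are made up for illustration; the per-release cost of the automated run should include time spent maintaining the tests.

```python
import math
from typing import Optional


def break_even_release(
    tool_cost: float,
    test_development_cost: float,
    manual_cost_per_release: float,
    automated_cost_per_release: float,
) -> Optional[int]:
    """Number of releases after which automation has paid for itself.

    Returns None if automation never becomes cheaper per release.
    """
    saving_per_release = manual_cost_per_release - automated_cost_per_release
    if saving_per_release <= 0:
        return None
    upfront = tool_cost + test_development_cost
    return math.ceil(upfront / saving_per_release)


# Made-up pilot: a $500 tool, $4000 of test development, and a manual pass
# costing $2000 per release versus $500 to run and maintain the automation.
print(break_even_release(500, 4000, 2000, 500))
```

With these numbers the upfront cost is 4500 and each release saves 1500, so the pilot pays for itself on the third release.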
Good luck to you. I screamed for all of this and I work for a billion+ company. We still perform manual testing (including regression testing). Automated tests are finally being instituted because some of the developers went out and got demos of some of the software you're describing and began configuring a framework.
Your best bet is to come up with an actual dollars-and-cents documented comparison between working with a product and working without it, to prove unequivocally to the managers in charge of spending the money and designing the processes that not only is the ROI there, but the people who need to perform testing and/or change their existing processes will actually find their jobs a little bit easier.
Go grassroots. Talk to your team, get them on board. Talk to your business analysts, get them on board. Talk to any QA people you have and get them on board. When the villagers attack the castle with pitchforks and torches, you can bet that the wallets will open up and you'll be performing automated testing.
I would just try to automate as much as you can, whenever you can. I don't think you need to necessarily ask for permission to do things like this. Maybe your management doesn't think of these things, and often they won't see the benefit until you show them a great example.
Is it just that capital expenditures are difficult? I've seen places where the time of existing employees is already spent, and therefore essentially worthless in comparison to new purchases.
As for convincing managers: compare the cost of manual regression tests with the cost to automate. If you are running lots of manual tests, this should be an easy win. If you aren't running the tests often, try for the cost of a bug. However, in many companies the cost of a bug isn't attributed to the development department, so quality and the cost of a bug may not be a strong motivation (in other words, quality is just about pride and ego, not actually what it costs).
Convincing developers... if they aren't already on board... electro-shock therapy? If they aren't there, it's going to be an uphill battle.
I have been trying to do something similar on my current project... I can say there's another factor - time. There's a learning curve on automated tools and automated test development. The first release that is tested with automated tools will not be tested as quickly as it was manually, because the testers are learning the tools in addition to exercising tests. The second release will be much faster and every release after that will be faster still - but the first one will be a schedule hit, if not a cost hit.
The financial case is not too hard - over time, the project saves lots of money, as resources for repetitive testing are vastly reduced.
But the hard part is to find a strategy that lets you get the tool into use with a minimum of schedule drag on the first release that uses it. Testing is always squashed at the end of the schedule, so it's the thing most sensitive to schedule stress. Anything you can do to show management how to reduce or remove the learning curve and the automated test setup and installation time is likely to increase your chances of using the tool.

Profiling a VxWorks system

We've got a fairly large application running on VxWorks 5.5.1 that's been developed and modified for around 10 years now. We have some simple home-grown tools to show that we are not using too much memory or too much processor, but we don't have a good feel for how much headroom we actually have. It's starting to make it difficult to do estimates for future enhancements.
Does anybody have any suggestions on how to profile such a system? We've never had much luck getting the Wind River tools to work.
For bonus points: the other complication is that our system has very different behaviors at different times; during start-up it does a lot of stuff, then it sits relatively idle except for brief bursts of activity. If there is a profiler with some programmatic way to have to record state information, I think that'd be very useful too.
FWIW, this is compiled with GCC and written entirely in C.
I've done a lot of performance tuning of various kinds of software, including embedded applications. I won't discuss memory profiling - I think that is a different issue.
I can only guess where the "well-known" idea originated that to find performance problems you need to measure performance of various parts. That is a top-down approach, similar to the way governments try to control budget waste, by subdividing. IMHO, it doesn't work very well.
Measurement is OK for seeing if what you did made a difference, but it is poor at telling you what to fix.
What is good at telling you what to fix is a bottom-up approach, in which you examine a representative sample of microscopic units of what is being spent, and finding out the full explanation of why each one is being spent. This works for a simple statistical reason. If there is a reason why some percent (for example 40%) of samples can be saved, on average 40% of samples will show it, and it doesn't require a huge number of samples. It does require that you examine each sample carefully, and not just sort of aggregate them into bigger bunches.
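To put a rough number on "it doesn't require a huge number of samples", here is a small sketch using the binomial distribution. Treating the samples as independent, and asking for the activity to appear on at least two samples, are my assumptions for the illustration; the function name is made up.

```python
from math import comb


def prob_seen_at_least(fraction: float, samples: int, at_least: int = 2) -> float:
    """Probability that an activity costing `fraction` of the time shows up
    on at least `at_least` of `samples` independent random halts."""
    p_fewer = sum(
        comb(samples, k) * fraction**k * (1 - fraction) ** (samples - k)
        for k in range(at_least)
    )
    return 1.0 - p_fewer


# Something responsible for 40% of the time, looked for in 10 and 20 samples.
for n in (10, 20):
    print(f"{n} samples: {prob_seen_at_least(0.4, n):.3f}")
```

For the 40% case, ten samples already give better than a 95% chance of catching it twice, which is the statistical point being made.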
As a historical example, this is what Harry Truman did at the outbreak of the U.S. involvement in WW II. There was terrific waste in the defense industry. He just got in his car, drove out to the factories, and interviewed the people standing around. Then he went back to the U.S. Senate, explained what the problems were exactly, and got them fixed.
Maybe this is more of an answer than you wanted. Specifically, this is the method I use, and this is a blow-by-blow example of it.
ADDED: I guess the idea of finding-by-measuring is simply natural. Around '82 I was working on an embedded system, and I needed to do some performance tuning. The hardware engineer offered to put a timer on the board that I could read (providing from his plenty). IOW he assumed that finding performance problems required timing. I thanked him and declined, because by that time I knew and trusted the random-halt technique (done with an in-circuit-emulator).
If you have the Auxiliary Clock available, you could use the SPY utility (configurable via the config.h file) which does give you a very rough approximation of which tasks are using the CPU.
The nice thing about it is that it does not require being attached to the Tornado environment and you can use it from the Kernel shell.
Otherwise, btpierre's suggestion of using taskHookAdd has been used successfully in the past.
I've worked on systems that have had luck using locally-built monitoring utilities based on taskSwitchHookAdd and related functions (delete hook, etc).
"Simply" use this to track the number of ticks a given task runs. I realize that this is fairly gross scale information for profiling, but it can be useful depending on your needs.
To see how much cpu% each task is using, calculate the percentage of ticks assigned to each task.
To see how much headroom you have, add a lowest priority "idle" task that just does "while(1){}", and see how much cpu% is assigned to it. Roughly speaking, that's your headroom.
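The arithmetic on the collected tick counts might look like the sketch below. It is host-side post-processing written in Python, not target code: it assumes the per-task tick counts have already been gathered on the target (for example by a switch hook) and dumped somehow. The task names and numbers are made up.

```python
def cpu_percentages(ticks_by_task: dict[str, int]) -> dict[str, float]:
    """Share of all recorded ticks that each task received, in percent."""
    total = sum(ticks_by_task.values())
    if total == 0:
        raise ValueError("no ticks recorded")
    return {task: 100.0 * ticks / total for task, ticks in ticks_by_task.items()}


# Hypothetical counts from one measurement window; "tIdle" stands for the
# lowest-priority busy-loop task described above.
window = {"tAppMain": 250, "tComms": 150, "tLogger": 100, "tIdle": 500}

usage = cpu_percentages(window)
for task, pct in sorted(usage.items(), key=lambda item: -item[1]):
    print(f"{task:10s} {pct:5.1f}%")
print(f"approximate headroom: {usage['tIdle']:.1f}%")
```

Taking separate windows for start-up, idle periods and bursts of activity would give a headroom figure for each of those phases rather than one average.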

What are the most useful software development metrics? [closed]

I would like to track metrics that can be used to improve my team’s software development process, improve time estimates, and detect special case variations that need to be addressed during the project execution.
Please limit each answer to a single metric, describe how to use it, and vote up the good answers.
ROI.
The total amount of revenue brought in by the software minus the total amount of costs to produce the software. Break down the costs by percentage of total cost and isolate your poorest performing and most expensive area in terms of return-on-investment. Improve, automate, or eliminate that problem area if possible. Conversely, find your highest return-on-investment area and find ways to amplify its effects even further. If 80% of your ROI comes from 20% of your cost or effort, expand that particular area and minimize the rest by comparison.
Costs will include payroll, licenses, legal fees, hardware, office equipment, marketing, production, distribution, and support. This can be done on a macro level for a company as whole or a micro level for a team or individual. It can also be applied to time, tasks, and methods in addition to revenue.
This doesn't mean ignore all the details, but find a way to quantify everything and then concentrate on the areas that yield the best (objective) results.
Inverse code coverage
Get the percentage of code not executed during a test. This is similar to what Shafa mentioned, but the usage is different. If a line of code is run during testing then we know it might be tested. But if a line of code has not been run then we know for sure that it has not been tested. Targeting these areas for unit testing will improve quality and takes less time than auditing the code that has been covered. Ideally you can do both, but that never seems to happen.
"improve my team’s software development process": Defect Find and Fix Rates
This relates to the number of defects or bugs raised against the number of fixes which have been committed or verified.
I'd have to say this is one of the really important metrics because it gives you two things:
1. Code churn. How much code is being changed on a daily/weekly basis (which is important when you are trying to stabilize for a release), and,
2. Shows you whether defects are ahead of fixes or vice-versa. This shows you how well the development team is responding to defects raised by the QA/testers.
A low fix rate indicates the team is busy working on other things (features perhaps). If the bug count is high, you might need to get developers to address some of the defects.
A low find rate indicates either your solution is brilliant and almost bug free, or the QA team have been blocked or have another focus.
Track how long it takes to do a task that has an estimate against it. If it came in well under, question why. If it came in well over, question why.
Don't make it a negative thing, it's fine if tasks blow out or were way under estimated. Your goal is to continually improve your estimation process.
Track the source and type of bugs that you find.
The bug source represents the phase of development in which the bug was introduced. (eg. specification, design, implementation etc.)
The bug type is the broad style of bug. eg. memory allocation, incorrect conditional.
This should allow you to alter the procedures you follow in that phase of development and to tune your coding style guide to try to eliminate over-represented bug types.
Velocity: the number of features per given unit time.
Up to you to determine how you define features, but they should be roughly the same order of magnitude otherwise velocity is less useful. For instance, you may classify your features by stories or use cases. These should be broken down so that they are all roughly the same size. Every iteration, figure out how many stories (use-cases) got implemented (completed). The average number of features/iteration is your velocity. Once you know your velocity based on your feature unit you can use it to help estimate how long it will take to complete new projects based on their features.
[EDIT] Alternatively, you can assign a weight like function points or story points to each story as a measure of complexity, then add up the points for each completed feature and compute velocity in points/iteration.
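A minimal sketch of the points-per-iteration calculation and the estimate it enables. The function names and all the numbers are made up for illustration.

```python
from math import ceil
from statistics import mean


def velocity(points_per_iteration: list[float]) -> float:
    """Average points (or features) completed per iteration."""
    return mean(points_per_iteration)


def iterations_needed(backlog_points: float, team_velocity: float) -> int:
    """Whole iterations required to finish a backlog at the given velocity."""
    return ceil(backlog_points / team_velocity)


# Points completed in the last three iterations.
completed = [18, 22, 20]
v = velocity(completed)
print(f"velocity: {v} points/iteration")
print(f"iterations for a 130-point project: {iterations_needed(130, v)}")
```

With an average of 20 points per iteration, a 130-point project rounds up to 7 iterations.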
Track the number of clones (similar code snippets) in the source code.
Get rid of clones by refactoring the code as soon as you spot the clones.
Average function length, or possibly a histogram of function lengths to get a better feel.
The longer a function is, the less obvious its correctness. If the code contains lots of long functions, it's probably a safe bet that there are a few bugs hiding in there.
Number of failing tests or broken builds per commit.
Interdependency between classes: how tightly your code is coupled.
Track whether a piece of source has undergone review and, if so, what type. And later, track the number of bugs found in reviewed vs. unreviewed code.
This will allow you to determine how effectively your code review process(es) are operating in terms of bugs found.
If you're using Scrum, the backlog. How big is it after each sprint? Is it shrinking at a consistent rate? Or is stuff being pushed into the backlog because of (a) stuff that wasn't thought of to begin with ("We need another use case for an audit report that no one thought of, I'll just add it to the backlog.") or (b) not getting stuff done and pushing it into the backlog to meet the date instead of the promised features.
http://cccc.sourceforge.net/
Fan in and Fan out are my favorites.
Fan in:
How many other modules/classes use/know this module
Fan out:
How many other modules does this module use/know
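As a rough illustration of the two definitions, both counts can be derived from a simple "module uses modules" map. The map below is made up; in practice it would come from a tool that parses your imports or includes.

```python
def fan_out(uses: dict[str, set[str]]) -> dict[str, int]:
    """How many other modules each module uses."""
    return {module: len(deps) for module, deps in uses.items()}


def fan_in(uses: dict[str, set[str]]) -> dict[str, int]:
    """How many other modules use each module."""
    counts = {module: 0 for module in uses}
    for deps in uses.values():
        for dep in deps:
            counts[dep] = counts.get(dep, 0) + 1
    return counts


# Hypothetical dependency map: key uses every module in its set.
uses = {
    "ui": {"orders", "auth"},
    "orders": {"db", "auth"},
    "auth": {"db"},
    "db": set(),
}

print("fan in: ", fan_in(uses))
print("fan out:", fan_out(uses))
```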
improve time estimates
While Joel Spolsky's Evidence-based Scheduling isn't per se a metric, it sounds like exactly what you want. See http://www.joelonsoftware.com/items/2007/10/26.html
I especially like and use the system that Mary Poppendieck recommends. This system is based on three holistic measurements that must be taken as a package (so no, I'm not going to provide 3 answers):
Cycle time
From product concept to first release or
From feature request to feature deployment or
From bug detection to resolution
Business Case Realization (without this, everything else is irrelevant)
P&L or
ROI or
Goal of investment
Customer Satisfaction
e.g. Net Promoter Score
I don't need more to know if we are in phase with the ultimate goal: providing value to users, and fast.
Number of similar lines (copy/pasted code).
improve my team’s software development process
It is important to understand that metrics can do nothing to improve your team’s software development process. All they can be used for is measuring how well you are advancing toward improving your development process in regards to the particular metric you are using. Perhaps I am quibbling over semantics but the way you are expressing it is why most developers hate it. It sounds like you are trying to use metrics to drive a result instead of using metrics to measure the result.
To put it another way, would you rather have 100% code coverage and lousy unit tests or fantastic unit tests and < 80% coverage?
Your answer should be the latter. In a perfect world you could have both, but you had better focus on the unit tests first and let the coverage get there when it does.
Most of the aforementioned metrics are interesting but won't help you improve team performance. The problem is that you're asking a management question in a development forum.
Here are a few metrics: estimates vs. actuals at the project schedule level and personal level (see previous link to Joel's Evidence-based method), % defects removed at release (see my blog: http://redrockresearch.org/?p=58), scope creep per month, and overall productivity rating (Putnam's productivity index). Developer bandwidth is also good to measure.
Every time a bug is reported by the QA team, analyze why that defect escaped unit testing by the developers.
Consider this as a perpetual-self-improvement exercise.
I like Defect Resolution Efficiency (DRE) metrics. DRE is the ratio of defects resolved prior to software release to all defects found. I suggest tracking this metric for each release of your software into production.
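The ratio described here is simple enough to compute per release. A minimal sketch, where the function name and argument names are made up for illustration and the counts are assumed to come from your bug tracker:

```python
def defect_resolution_efficiency(resolved_before_release: int, total_found: int) -> float:
    """Ratio of defects resolved before release to all defects found."""
    if total_found <= 0:
        raise ValueError("total_found must be positive")
    if not 0 <= resolved_before_release <= total_found:
        raise ValueError("resolved_before_release must be between 0 and total_found")
    return resolved_before_release / total_found
```

For instance, a release where 45 of the 50 defects found so far were resolved before shipping gives `defect_resolution_efficiency(45, 50)`, i.e. 90%.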
Tracking metrics in QA has been a fundamental activity for quite some time now. But often, development teams do not fully look at how relevant these metrics are in relation to all aspects of the business. For example, the typical tracked metrics such as defect ratios, validity, test productivity, code coverage etc. are usually evaluated in terms of the functional aspects of the software, but few pay attention to how they matter to the business aspects of software.
There are also other metrics that can add much value to the business aspects of the software, which is very important when an overall quality view of the software is looked at. These can be broadly classified into:
Needs of the beta users captured by business analysts, marketing and sales folks
End-user requirements defined by the product management team
Ensuring availability of the software at peak loads and ability of the software to integrate with enterprise IT systems
Support for high-volume transactions
Security aspects depending on the industry that the software serves
Availability of must-have and nice-to-have features in comparison to the competition
And a few more….
Code coverage percentage
If you're using Scrum, you want to know how each day's Scrum went. Are people getting done what they said they'd get done?
Personally, I'm bad at it. I chronically run over on my dailies.
Perhaps you can test CodeHealer
CodeHealer performs an in-depth analysis of source code, looking for problems in the following areas:
Audits: Quality control rules such as unused or unreachable code, use of directive names and keywords as identifiers, identifiers hiding others of the same name at a higher scope, and more.
Checks: Potential errors such as uninitialised or unreferenced identifiers, dangerous type casting, automatic type conversions, undefined function return values, unused assigned values, and more.
Metrics: Quantification of code properties such as cyclomatic complexity, coupling between objects (Data Abstraction Coupling), comment ratio, number of classes, lines of code, and more.
Size and frequency of source control commits.

How to avoid the dangers of optimisation when designing for the unknown?

A two parter:
1) Say you're designing a new type of application and you're in the process of coming up with new algorithms to express the concepts and content -- does it make sense to attempt to actively not consider optimisation techniques at that stage, even if in the back of your mind you fear it might end up as O(N!) over millions of elements?
2) If so, say to avoid limiting cool functionality which you might be able to optimise once the proof-of-concept is running -- how do you stop yourself from this programmers habit of a lifetime? I've been trying mental exercises, paper notes, but I grew up essentially counting clock cycles in assembler and I continually find myself vetoing potential solutions for being too wasteful before fully considering the functional value.
Edit: This is about designing something which hasn't been done before (the unknown), when you're not even sure if it can be done in theory, never mind with unlimited computing power at hand. So answers along the line of "of course you have to optimise before you have a prototype because it's an established computing principle," aren't particularly useful.
I say all the following not because I think you don't already know it, but to provide moral support while you suppress your inner critic :-)
The key is to retain sanity.
If you find yourself writing a Theta(N!) algorithm which is expected to scale, then you're crazy. You'll have to throw it away, so you might as well start now finding a better algorithm that you might actually use.
If you find yourself worrying about whether a bit of Pentium code, that executes precisely once per user keypress, will take 10 cycles or 10K cycles, then you're crazy. The CPU is 95% idle. Give it ten thousand measly cycles. Raise an enhancement ticket if you must, but step slowly away from the assembler.
One thing to decide is whether the project is "write a research prototype and then evolve it into a real product", or "write a research prototype", obviously with an expectation that if the research succeeds, there will be another related project down the line.
In the latter case (which from comments sounds like what you have), you can afford to write something that only works for N<=7 and even then causes brownouts from here to Cincinnati. That's still something you weren't sure you could do. Once you have a feel for the problem, then you'll have a better idea what the performance issues are.
What you're doing is striking a balance between wasting time now (on considerations that your research proves irrelevant) and wasting time later (because you didn't consider something now that turns out to be important). The more risky your research is, the more you should be happy just to do something, and worry about what you've done later.
My big answer is Test Driven Development. By writing all your tests up front, you force yourself to write only enough code to implement the behavior you are looking for. If timing and clock cycles become a requirement, then you can write tests to cover that scenario and refactor your code to meet those requirements.
Like security and usability, performance is something that has to be considered from the beginning of the project. As such, you should definitely be designing with good performance in mind.
The old Knuth line is "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." O(N!) to O(poly(N)) is not a "small efficiency"!
The best way to handle type 1 is to start with the simplest thing that could possibly work (O(N!) cannot possibly work unless you're not scaling past a couple dozen elements!) and encapsulate it from the rest of the application so you could rewrite it to a better approach assuming that there is going to be a performance issue.
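One way to read "simplest thing, encapsulated" is sketched below. The route-planning problem, the names `brute_force_tour` and `plan_route`, and the size cap of 9 points are all invented for illustration; the point is only that callers depend on `plan_route`, so the O(N!) reference implementation can be swapped out later without touching them.

```python
import math
from itertools import permutations
from typing import Callable, Sequence

Point = tuple[float, float]
TourSolver = Callable[[Sequence[Point]], list[Point]]


def tour_length(tour: Sequence[Point]) -> float:
    """Length of the closed tour visiting the points in order."""
    tour = list(tour)
    return sum(math.dist(a, b) for a, b in zip(tour, tour[1:] + tour[:1]))


def brute_force_tour(points: Sequence[Point]) -> list[Point]:
    """O(N!) reference solver: try every ordering. Only sane for tiny inputs."""
    if not points:
        return []
    if len(points) > 9:
        raise ValueError("brute force is only intended for very small inputs")
    first, rest = points[0], points[1:]
    best = min(permutations(rest), key=lambda order: tour_length([first, *order]))
    return [first, *best]


def plan_route(points: Sequence[Point], solver: TourSolver = brute_force_tour) -> list[Point]:
    """The only entry point callers use; the solver can be replaced later."""
    return solver(points)
```

If the prototype proves the idea is worth pursuing, a better algorithm can be passed in as `solver`, and the brute-force version remains useful as a correctness check on small inputs.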
Optimization isn't exactly a danger; it's good to think about speed to some extent when writing code, because it stops you from implementing slow and messy solutions when something simpler and faster would do. It also gives you a check in your mind on whether something is going to be practical or not.
The worst thing that can happen is you design a large program explicitly ignoring optimization, only to go back and find that your entire design is completely useless because it cannot be optimized without completely rewriting it. This never happens if you consider everything when writing it--and part of that "everything" is potential performance issues.
"Premature optimization is the root of all evil" is the root of all evil. I've seen projects crippled by overuse of this concept. At my company we have a software program that broadcasts transport streams from disk on the network. It was originally created for testing purposes (so we would just need a few streams at once), but it was always in the program's spec requirements that it work for larger numbers of streams so it could later be used for video on demand.
Because it was written completely ignoring speed, it was a mess; it had tons of memcpys despite the fact that they should never be necessary, its TS processing code was absurdly slow (it actually parsed every single TS packet multiple times), and so forth. It handled a mere 40 streams at a time instead of the thousands it was supposed to, and when it actually came time to use it for VOD, we had to go back and spend a huge amount of time cleaning it up and rewriting large parts of it.
"First, make it run. Then make it run fast."
or
"To finish first, first you have to finish."
A slow existing app is usually better than an ultra-fast non-existent app.
First of all, people claim that finishing is the only thing that matters (or almost).
But if you finish a product that has O(N!) complexity in its main algorithm, as a rule of thumb you have not finished it! You have an incomplete and unacceptable product in 99% of cases.
Reasonable performance is part of a working product. Perfect performance might not be. If you finish a text editor that needs 6 GB of memory to write a short note, then you have not finished a product at all; you only have a waste of time on your hands. You must always remember that it is not just delivering code that makes a product complete; it is making it capable of meeting the customers'/users' needs. If you fail at that, it does not matter that you finished writing the code on schedule.
So all optimizations that avoid a resulting useless product should be considered and applied, as long as they do not compromise the rest of the design and implementation process.
"actively not consider optimisation" sounds really weird to me. Usually the 80/20 rule works quite well. If you spend 80% of your time optimizing the program for less than 20% of use cases, it might be better not to waste the time unless those 20% of use cases really matter.
As for perfectionism, there is nothing wrong with it unless it starts to slow you down and makes you miss deadlines. The art of computer programming is a balancing act between the beauty and the functionality of your applications. To help yourself, consider learning time management. When you learn how to split and measure your work, it will be easy to decide whether to optimize right now or to create a working version first.
I think it is quite reasonable to forget about O(N!) worst case for an algorithm. First you need to determine that a given process is possible at all. Keep in mind that Moore's law is still in effect, so even bad algorithms will take less time in 10 or 20 years!
First optimize for design, i.e. get it to work first :-) Then optimize for performance. This is the kind of tradeoff Python programmers make inherently. By programming in a language that is typically slower at run-time but higher level (e.g. compared to C/C++) and thus faster to develop in, Python programmers are able to accomplish quite a bit. Then they focus on optimization.
One caveat: if the time it takes to finish is so long that you can't determine whether your algorithm is right, then it is a very good time to worry about optimization earlier on. I've encountered this scenario only a few times, but it is good to be aware of it.
Following on from onebyone's answer there's a big difference between optimising the code and optimising the algorithm.
Yes, at this stage optimising the code is going to be of questionable benefit. You don't know where the real bottlenecks are, you don't know if there is going to be a speed problem in the first place.
But being mindful of scaling issues even at this stage of the development of your algorithm/data structures etc. is not only reasonable but I suspect essential. After all there's not going to be a lot of point continuing if your back-of-the-envelope analysis says that you won't be able to run your shiny new application once to completion before the heat death of the universe happens. ;-)
I like this question, so I'm giving an answer, even though others have already answered it.
When I was in grad school, in the MIT AI Lab, we faced this situation all the time, where we were trying to write programs to gain understanding into language, vision, learning, reasoning, etc.
My impression was that those who made progress were more interested in writing programs that would do something interesting than do something fast. In fact, time spent worrying about performance was basically subtracted from time spent conceiving interesting behavior.
Now I work on more prosaic stuff, but the same principle applies. If I get something working I can always make it work faster.
I would caution however that the way software engineering is now taught strongly encourages making mountains out of molehills. Rather than just getting it done, folks are taught to create a class hierarchy, with as many layers of abstraction as they can make, with services, interface specifications, plugins, and everything under the sun. They are not taught to use these things as sparingly as possible.
The result is monstrously overcomplicated software that is much harder to optimize because it is much more complicated to change.
I think the only way to avoid this is to get a lot of experience doing performance tuning and in that way come to recognize the design approaches that lead to this overcomplication. (Such as: an over-emphasis on classes and data structure.)
Here is an example of tuning an application that has been written in the way that is generally taught.
I will give a little story about something that happened to me, but not really an answer.
I am developing a project for a client where one part of it is processing very large scans (images) on the server. When I wrote it I was focused on functionality, but I thought of several ways to optimize the code so it would be faster and use less memory.
Now an issue has arisen. During demos to potential clients and beta testing, on the demo unit (a self-contained laptop) it fails due to too much memory being used. It also fails on the dev server with really large files.
So was it an optimization, or was it a known future bug? Do I fix it or optimize it now? Well, that is to be determined, as there are other priorities as well.
It just makes me wish I had spent the time to optimize the code earlier on.
Think about the operational scenarios (use cases).
Say that we're making a pizza-shop finder gizmo.
The user turns on the machine. It has to show him the nearest pizza shop in a meaningful time. It turns out our users want to know fast: in under 15 seconds.
So now, for any idea you have, you think: is this ever going to run, realistically, in less than 15 seconds, minus all the other time spent doing important stuff?
Or you're a trading system: accurate sums. Less than a millisecond per trade if you can, please. (They'd probably accept 10ms.) So, again: you look at every idea from the point of view of the relevant scenarios.
Say it's a phone app: has to start in under (how many seconds)
Demonstrations to customers from laptops are ALWAYS a scenario. We've got to sell the product.
Maintenance, where some person upgrades the thing, is ALWAYS a scenario.
So now, as an example: all the hard, AI heavy, lisp-customized approaches are not suitable.
Or for different strokes, the XML server configuration file is not user friendly enough.
See how that helps.
If I'm concerned about the code's ability to handle data growth, before I get too far along I try to set up sample data sets in large chunk increments to test it with, like:
1000 records
10000 records
100000 records
1000000 records
and see where it breaks or becomes unusable. Then you can decide based on real data if you need to optimize or re-design the core algorithms.
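A small harness along these lines might look like the sketch below. The names `time_at_sizes` and `growth_ratios` are hypothetical, `make_data` stands in for whatever generates your sample records, and a single timed run per size is assumed to be good enough for a first look.

```python
import time


def time_at_sizes(func, make_data, sizes=(1_000, 10_000, 100_000, 1_000_000)):
    """Run func on generated data of each size; return [(size, seconds), ...]."""
    results = []
    for n in sizes:
        data = make_data(n)  # build the sample set outside the timed region
        start = time.perf_counter()
        func(data)
        results.append((n, time.perf_counter() - start))
    return results


def growth_ratios(results):
    """How many times slower each step is than the previous one."""
    return [
        later / earlier if earlier > 0 else float("inf")
        for (_, earlier), (_, later) in zip(results, results[1:])
    ]
```

With tenfold size steps, ratios near 10 suggest roughly linear behaviour, while ratios near 100 point at something quadratic that is worth a closer look before the data grows further.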