What are the most useful software development metrics? [closed] - process

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
I would like to track metrics that can be used to improve my team’s software development process, improve time estimates, and detect special case variations that need to be addressed during the project execution.
Please limit each answer to a single metric, describe how to use it, and vote up the good answers.

(source: osnews.com)

ROI.
The total amount of revenue brought in by the software minus the total amount of costs to produce the software. Breakdown the costs by percentage of total cost and isolate your poorest performing and most expensive area in terms of return-on-investment. Improve, automate, or eliminate that problem area if possible. Conversely, find your highest return-on-investment area and find ways to amplify its effects even further. If 80% of your ROI comes from 20% of your cost or effort, expand that particular area and minimize the rest by comparison.
Costs will include payroll, licenses, legal fees, hardware, office equipment, marketing, production, distribution, and support. This can be done on a macro level for a company as whole or a micro level for a team or individual. It can also be applied to time, tasks, and methods in addition to revenue.
This doesn't mean ignore all the details, but find a way to quantify everything and then concentrate on the areas that yield the best (objective) results.

Inverse code coverage
Get a percentage of code not executed during a test. This is similiar to what Shafa mentioned, but the usage is different. If a line of code is ran during testing then we know it might be tested. But if a line of code has not been ran then we know for sure that is has not been tested. Targeting these areas for unit testing will improve quality and takes less time than auditing the code that has been covered. Ideally you can do both, but that never seams to happen.

"improve my team’s software development process": Defect Find and Fix Rates
This relates to the number of defects or bugs raised against the number of fixes which have been committed or verified.
I'd have to say this is one of the really important metrics because it gives you two things:
1. Code churn. How much code is being changed on a daily/weekly basis (which is important when you are trying to stabilize for a release), and,
2. Shows you whether defects are ahead of fixes or vice-versa. This shows you how well the development team is responding to defects raised by the QA/testers.
A low fix rate indicates the team is busy working on other things (features perhaps). If the bug count is high, you might need to get developers to address some of the defects.
A low find rate indicates either your solution is brilliant and almost bug free, or the QA team have been blocked or have another focus.

Track how long is takes to do a task that has an estimate against it. If they were well under, question why. If they are well over, question why.
Don't make it a negative thing, it's fine if tasks blow out or were way under estimated. Your goal is to continually improve your estimation process.

Track the source and type of bugs that you find.
The bug source represents the phase of development in which the bug was introduced. (eg. specification, design, implementation etc.)
The bug type is the broad style of bug. eg. memory allocation, incorrect conditional.
This should allow you to alter the procedures you follow in that phase of development and to tune your coding style guide to try to eliminate over represented bug types.

Velocity: the number of features per given unit time.
Up to you to determine how you define features, but they should be roughly the same order of magnitude otherwise velocity is less useful. For instance, you may classify your features by stories or use cases. These should be broken down so that they are all roughly the same size. Every iteration, figure out how many stories (use-cases) got implemented (completed). The average number of features/iteration is your velocity. Once you know your velocity based on your feature unit you can use it to help estimate how long it will take to complete new projects based on their features.
[EDIT] Alternatively, you can assign a weight like function points or story points to each story as a measure of complexity, then add up the points for each completed feature and compute velocity in points/iteration.

Track the number of clones (similar code snippets) in the source code.
Get rid of clones by refactoring the code as soon as you spot the clones.

Average function length, or possibly a histogram of function lengths to get a better feel.
The longer a function is, the less obvious its correctness. If the code contains lots of long functions, it's probably a safe bet that there are a few bugs hiding in there.

number of failing tests or broken builds per commit.

interdependency between classes. how tightly your code is coupled.

Track whether a piece of source has undergone review and, if so, what type. And later, track the number of bugs found in reviewed vs. unreviewed code.
This will allow you to determine how effectively your code review process(es) are operating in terms of bugs found.

If you're using Scrum, the backlog. How big is it after each sprint? Is it shrinking at a consistent rate? Or is stuff being pushed into the backlog because of (a) stuff that wasn't thought of to begin with ("We need another use case for an audit report that no one thought of, I'll just add it to the backlog.") or (b) not getting stuff done and pushing it into the backlog to meet the date instead of the promised features.

http://cccc.sourceforge.net/
Fan in and Fan out are my favorites.
Fan in:
How many other modules/classes use/know this module
Fan out:
How many other modules does this module use/know

improve time estimates
While Joel Spolsky's Evidence-based Scheduling isn't per se a metric, it sounds like exactly what you want. See http://www.joelonsoftware.com/items/2007/10/26.html

I especially like and use the system that Mary Poppendieck recommends. This system is based on three holistic measurements that must be taken as a package (so no, I'm not going to provide 3 answers):
Cycle time
From product concept to first release or
From feature request to feature deployment or
From bug detection to resolution
Business Case Realization (without this, everything else is irrelevant)
P&L or
ROI or
Goal of investment
Customer Satisfaction
e.g. Net Promoter Score
I don't need more to know if we are in phase with the ultimate goal: providing value to users, and fast.

number of similar lines. (copy/pasted code)

improve my team’s software development process
It is important to understand that metrics can do nothing to improve your team’s software development process. All they can be used for is measuring how well you are advancing toward improving your development process in regards to the particular metric you are using. Perhaps I am quibbling over semantics but the way you are expressing it is why most developers hate it. It sounds like you are trying to use metrics to drive a result instead of using metrics to measure the result.
To put it another way, would you rather have 100% code coverage and lousy unit tests or fantastic unit tests and < 80% coverage?
Your answer should be the latter. You could even want the perfect world and have both but you better focus on the unit tests first and let the coverage get there when it does.

Most of the aforementioned metrics are interesting but won't help you improve team performance. Problem is your asking a management question in a development forum.
Here are a few metrics: Estimates/vs/actuals at the project schedule level and personal level (see previous link to Joel's Evidence-based method), % defects removed at release (see my blog: http://redrockresearch.org/?p=58), Scope creep/month, and overall productivity rating (Putnam's productivity index). Also, developers bandwidth is good to measure.

Every time a bug is reported by the QA team- analyze why that defect escaped unit-testing by the developers.
Consider this as a perpetual-self-improvement exercise.

I like Defect Resolution Efficiency metrics. DRE is ratio of defects resolved prior to software release against all defects found. I suggest tracking this metrics for each release of your software into production.

Tracking metrics in QA has been a fundamental activity for quite some time now. But often, development teams do not fully look at how relevant these metrics are in relation to all aspects of the business. For example, the typical tracked metrics such as defect ratios, validity, test productivity, code coverage etc. are usually evaluated in terms of the functional aspects of the software, but few pay attention to how they matter to the business aspects of software.
There are also other metrics that can add much value to the business aspects of the software, which is very important when an overall quality view of the software is looked at. These can be broadly classified into:
Needs of the beta users captured by business analysts, marketing and sales folks
End-user requirements defined by the product management team
Ensuring availability of the software at peak loads and ability of the software to integrate with enterprise IT systems
Support for high-volume transactions
Security aspects depending on the industry that the software serves
Availability of must-have and nice-to-have features in comparison to the competition
And a few more….

Code coverage percentage

If you're using Scrum, you want to know how each day's Scrum went. Are people getting done what they said they'd get done?
Personally, I'm bad at it. I chronically run over on my dailies.

Perhaps you can test CodeHealer
CodeHealer performs an in-depth analysis of source code, looking for problems in the following areas:
Audits Quality control rules such as unused or unreachable code,
use of directive names and
keywords as identifiers, identifiers
hiding others of the same name at a
higher scope, and more.
Checks Potential errors such as uninitialised or unreferenced
identifiers, dangerous type casting,
automatic type conversions, undefined
function return values, unused
assigned values, and more.
Metrics Quantification of code properties such as cyclomatic
complexity, coupling between objects
(Data Abstraction Coupling), comment
ratio, number of classes, lines of
code, and more.

Size and frequency of source control commits.

Related

Software Metrics in Agile Methodologies [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 5 years ago.
Improve this question
Agile methodologies are rather prevalent these days, but I cannot seem to find much documentation on what metrics are most useful and why. I have found many more things saying that some traditional metrics like LOC and code coverage of tests are not appropriate, leaving two main questions:
Why are those two (and other) metrics inappropriate?
What metrics are best for Agile and why?
Even with an Agile process, wouldn't you want to know how much code coverage you have with your unit tests? Or is it simply that this metric (and others) just are not as useful as other metrics like cyclomatic complexity and velocity?
Agile is a business oriented thing, Agile is about maximizing the customer value while minimizing waste to provide the most optimal ROI. This is what should get measured. And to do so, I use the system that Mary Poppendieck recommends. This system is based on three holistic measurements that must be taken as a package:
Cycle time
From product concept to first release or
From feature request to feature deployment or
From bug detection to resolution
Business Case Realization (without this, everything else is irrelevant)
P&L or
ROI or
Goal of investment
Customer Satisfaction
e.g. Net Promoter Score
Sure, at the team level you can track things like test coverage, cyclomatic complexity, conformance to coding standards, etc, but high quality is not an end in itself, it's just a mean. Don't misinterpret me, I'm not saying high quality doesn't matters, high quality is mandatory to achieve sustainable pace (and we include "no increase of the technical debt" in our Definition of Done) but still, the goal is to deliver value to the customer in a fast and profitable way.
Irrespective of methodology, there are some basic metrics that can and should be used.
According to S. Kahn, the most important are the following three:
size of product
number of defects found in final phase of testing
and number of defects found in the field.
If those are all you track, there's at least five ways they can be used:
calculate product defect rate (A)
calculate test defect rate (B)
determine a desirable goal for A and monitor the performance
determine a desirable goal for B and monitor the performance
assess correlation between A and B
if correlation is found, form metric of test effectiveness (B/A * 100%)
Although not necessarily fun to read, Metrics and Models of Software Quality Engineering provides an excellent in-depth software engineering and metrics overview.
1.1) LOC are easy to answer
They are really dependent of the language you use! The same feature might have a big difference when written on JAVA or on Ruby, for example
A not well written software might have more lines than a good one!
1.2) Code coverage
IMHO you should use metric, although its not perfect, it should give you a nice understanding on where your code needs more tests.
Just one point you should take care here is that it is also dependent of the language. There could be some situations where you have a class or method that you really don't need to test! For example a class with only getters and setters.
2) From (1) you just mentioned code metrics, but judging from your question about velocity, you are interested on metrics on all the creation process, so I would list some:
Velocity: The classic one and, if used well, it can enhance quite well an agile team performance, since you will know what your team can really do on a fixed time.
Burn up and burn down charts : they can give you a good notion about how the team is performing during the interaction (sprint)
There are some articles on InfoQ about this. Here and here.
As for question 1, I don't see any reason those metrics would be bad in an Agile process.
LOC provides you with a relative size measurement. While it may not always be useful to compare numbers between projects, it can provide you with a rate of growth within the project. If you can get it, the number of lines changed within a sprint may be useful as well to track a rate or refactoring.
Code coverage (of lines of code) gives you a general sense of whether or not your team is meeting a minimum bar of automated testing within a project.
As for question 2, keep the items above and here are a few more:
LOC versus test count. If you can, maintain separate ratios for unit, integration and system tests.
Average number of acceptance criteria versus test scenarios (or tests) for each story. It can help provide a better sense of whether or not your testing against the story's intent.
Number of defects discovered
Amount of work discovered (this is often captured by Agile tracking software) that wasn't part original estimates. It will help you judge if you are doing 'enough' planning.
Tracking consistencies, or lack thereof, of velocity sprint to sprint
While probably not popular and probably potentially dangerous, tracking estimates to work completed for each developer. While teams are supposed to be self organized and driven, not all teams are capable of dealing with human problems.
Just to add
Why LOC and Code Coverage of Tests are less than ideal:
Agile emphasizes outcome, not output (see Agile Manifesto). These two simply track output. Also, they do not properly measure refactoring, which is a vital aspect of Agile processes.
Another metric to consider would be Running Tested Features. I can't describe any better than this: http://xprogramming.com/articles/jatrtsmetric/
I'm going to answer to this very old question...
LOC and Test coverage are, in my opinion, good metrics, but they have one big problem: if you push them, you can make them grow fastly, but the result will be terryifing: tons of nonsense code, or in the test coverage, you can invoque all your code in a try-catch block and not write one single assert... Or even worse, just write one for "compliance" reasons, but without any business-facing or code-facing meaning...
So, these kind of metrics are very good if they help the team to honestly evaluate their outcome, but are an evil tool if they form part of some "compliance" rules, as using them in that way causes more harm (dead code, bad tests!) than what you originally wanted to achieve.
So, with every metric, think how you would trick it if you were forced to achieve a certain value, and think of the consequences... This is not an issue of LOC or test coverage, many other metrics can have similar outcome, even cyclomatic complexity... If you divide your code in a bad manner, you can reduce cyclomatic complexity, but it doesn't mean you get better or more readable code!
So, these kind of metrics are quite good to see what's happening inside a team, but any measure you take should be based on concrete goals, not on the metric itself... For example:
Test coverage is low: you implement coding dojos once a month to help train people to write testable code, you find out what code has the worst test coverage and try to implement a better / more testable architecture that helps / motivates developers to write test, etc.
As you can see, you never tell the team to achieve a certain value of test coverage, you just use the metric to see where you can improve and then look for measures that benefit your process, after a time you would expect test coverage to increase, but you are not pushing people to do so! You are evaluating changes in order to see if the measures are helping. If after a time you find out that test coverage has not changed with your measures, then it's time to look for other ideas, and so on...

estimating of testing effort as a percentage of development time [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 9 years ago.
Does anyone use a rule of thumb basis to estimate the effort required for testing as a percentage of the effort required for development? And if so what percentage do you use?
From my experience, 25% effort is spent on Analysis; 50% for Design, Development and Unit Test; remaining 25% for testing. Most projects will fit within a +/-10% variance of this rule of thumb depending on the nature of the project, knowledge of resources, quality of inputs & outputs, etc. One can add a project management overhead within these percentages or as an overhead on top within a 10-15% range.
The Google Testing Blog discussed this problem recently:
So a naive answer is that writing test carries a 10% tax. But, we pay taxes in order to get something in return.
(snip)
These benefits translate to real value today as well as tomorrow. I write tests, because the additional benefits I get more than offset the additional cost of 10%. Even if I don't include the long term benefits, the value I get from test today are well worth it. I am faster in developing code with test. How much, well that depends on the complexity of the code. The more complex the thing you are trying to build is (more ifs/loops/dependencies) the greater the benefit of tests are.
When you're estimating testing you need to identify the scope of your testing - are we talking unit test, functional, UAT, interface, security, performance stress and volume?
If you're on a waterfall project you probably have some overhead tasks that are fairly constant. Allow time to prepare any planning documents, schedules and reports.
For a functional test phase (I'm a "system tester" so that's my main point of reference) don't forget to include planning! A test case often needs at least as much effort to extract from requirements / specs / user stories as it will take to execute. In addition you need to include some time for defect raising / retesting. For a larger team you'll need to factor in test management - scheduling, reporting, meetings.
Generally my estimates are based on the complexity of the features being delivered rather than a percentage of dev effort. However this does require access to at least a high-level set of instructions. Years of doing testing enables me to work out that a test of a particular complexity will take x hours of effort for preparation and execution. Some tests may require extra effort for data setup. Some tests may involve negotiating with external systems and have a duration far in excess of the effort required.
In the end, though, you need to review it in the context of the overall project. If your estimate is well above that for BA or Development then there may be something wrong with your underlying assumptions.
I know this is an old topic but it's something I'm revisiting at the moment and is of perennial interest to project managers.
Some years ago, in a safety critical field, I have heard something like one day for unit testing ten lines of code.
I have also observed 50% of effort for development and 50% for testing (not only unit testing).
Are you talking about automated unit/integration tests or manual tests?
For the former, my rule of thumb (based on measurements) is 40-50% added to development time i.e. if developing a use case takes 10 days (before an QA and serious bugfixing happens), writing good tests takes another 4 to 5 days - though this should best happen before and during development, not afterwards.
When you speak of tests, you could mean waterfall or agile test development. In an agile environment, developers should spend 50% of their time developing and maintaining tests.
But that 50% extra will save you time when the re-factoring and manual verification time comes.
Testing time is probably more closely correlated to feature scope than development time. I'd also argue (perhaps controversially) that testing time is correlated to the skill of your development team.
For a 6-to-9 month development effort, I demand a absolute minimum of 2 weeks testing time, performed by actual testers (not the development team) who are well-versed in the software they will be testing (i.e., 2 weeks does not include ramp-up time). This is for a project that has ~5 developers.
Gartner in Oct 2006 states that testing typically consumes between 10% and 35% of work on a system integration project. I assume that it applies to the waterfall method. This is quite a wide range - but there are many dependencies on the amount of customisations to a standard product and the number of systems to be integrated.
The only time I factor in extra time for testing is if I'm unfamiliar with the testing technology I'll be using (e.g. using Selenium tests for the first time). Then I factor in maybe 10-20% for getting up to speed on the tools and getting the test infrastructure in place.
Otherwise testing is just an innate part of development and doesn't warrant an extra estimate. In fact, I'd probably increase the estimate for code done without tests.
EDIT: Note that I'm usually writing code test-first. If I have to come in after the fact and write tests for existing code that's going to slow things down. I don't find that test-first development slows me down at all except for very exploratory (read: throw-away) coding.
Judge by yesterday's weather. How long did it take last time? Are you trending longer or shorter? Each shop is different.
Most agile shops need a lot less time, have drastically fewer defects, and quicker time to resolve them because of TDD. Even so, most agile shops have some measurable time spent with testing/QC.
If this is the first test run for this application, then the answer is "lets see" followed by an attempt. It depends on how quick you can get questions answered,
- how testable it is,
- how many features/functions
- how many defects are discovered,
- how quickly issues are resolved,
- how many times the code cycles
through testing, and
- how many times testing is blocked by
bugs.
There is no way to tell. You could call it 50% or 175% or more, and not be wrong. Why not make a rough guess and multiply by Pi? It won't be much worse than any other answer you can make up.
You should (must) know how long it takes now and whether it's getting faster or slower, and whether the coverage is increasing or decreasing. With those three bits of information, you should be able to guess quite well.

Profiling a VxWorks system

We've got a fairly large application running on VxWorks 5.5.1 that's been developed and modified for around 10 years now. We have some simple home-grown tools to show that we are not using too much memory or too much processor, but we don't have a good feel for how much headroom we actually have. It's starting to make it difficult to do estimates for future enhancements.
Does anybody have any suggestions on how to profile such a system? We've never had much luck getting the Wind River tools to work.
For bonus points: the other complication is that our system has very different behaviors at different times; during start-up it does a lot of stuff, then it sits relatively idle except for brief bursts of activity. If there is a profiler with some programmatic way to have to record state information, I think that'd be very useful too.
FWIW, this is compiled with GCC and written entirely in C.
I've done a lot of performance tuning of various kinds of software, including embedded applications. I won't discuss memory profiling - I think that is a different issue.
I can only guess where the "well-known" idea originated that to find performance problems you need to measure performance of various parts. That is a top-down approach, similar to the way governments try to control budget waste, by subdividing. IMHO, it doesn't work very well.
Measurement is OK for seeing if what you did made a difference, but it is poor at telling you what to fix.
What is good at telling you what to fix is a bottom-up approach, in which you examine a representative sample of microscopic units of what is being spent, and finding out the full explanation of why each one is being spent. This works for a simple statistical reason. If there is a reason why some percent (for example 40%) of samples can be saved, on average 40% of samples will show it, and it doesn't require a huge number of samples. It does require that you examine each sample carefully, and not just sort of aggregate them into bigger bunches.
As a historical example, this is what Harry Truman did at the outbreak of the U.S. involvement in WW II. There was terrific waste in the defense industry. He just got in his car, drove out to the factories, and interviewed the people standing around. Then he went back to the U.S. Senate, explained what the problems were exactly, and got them fixed.
Maybe this is more of an answer than you wanted. Specifically, this is the method I use, and this is a blow-by-blow example of it.
ADDED: I guess the idea of finding-by-measuring is simply natural. Around '82 I was working on an embedded system, and I needed to do some performance tuning. The hardware engineer offered to put a timer on the board that I could read (providing from his plenty). IOW he assumed that finding performance problems required timing. I thanked him and declined, because by that time I knew and trusted the random-halt technique (done with an in-circuit-emulator).
If you have the Auxiliary Clock available, you could use the SPY utility (configurable via the config.h file) which does give you a very rough approximation of which tasks are using the CPU.
The nice thing about it is that it does not require being attached to the Tornado environment and you can use it from the Kernel shell.
Otherwise, btpierre's suggestion of using taskHookAdd has been used successfully in the past.
I've worked on systems that have had luck using locally-built monitoring utilities based on taskSwitchHookAdd and related functions (delete hook, etc).
"Simply" use this to track the number of ticks a given task runs. I realize that this is fairly gross scale information for profiling, but it can be useful depending on your needs.
To see how much cpu% each task is using, calculate the percentage of ticks assigned to each task.
To see how much headroom you have, add a lowest priority "idle" task that just does "while(1){}", and see how much cpu% it is assigned to it. Roughly speaking, that's your headroom.

Given this expectations, what language or system would you choose to implement the solution? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 9 years ago.
Here are the estimates the system should handle:
3000+ end users
150+ offices around the world
1500+ concurrent users at peak times
10.000+ daily updates
4-5 commits per second
50-70 transactions per second (reads/searches/updates)
This will be internal only business application, dedicated to help shipping company with worldwide shipment management.
What would be your technology choice, why that choice and roughly how long would it take to implement it? Thanks.
Note: I'm not recruiting. :-)
So, you asked how I would tackle such a project. In the Smalltalk world, people seem to agree that Gemstone makes things scale somewhat magically.
So, what I'd really do is this: I'd start developing in a simple Squeak image, using SandstoneDB. Then, this moment would come where a single image begins being too slow.
GemStone then takes care of copying your public objects (those visible from a certain root) back and forth between all instances. You get sessions and enhanced query functionalities, plus quite a fast VM.
It shares data with C, Java and Ruby.
In fact, they have their own VM for ruby, which is also worth a look.
wikipedia manages much more demanding requirements with MySQL
Your volumes are significant but not likely to strain any credible RDBMS if programmed efficiently. If your team is sloppy (i.e., casually putting SQL queries directly into components which are then composed into larger components), you face the likelihood of a "multiplier" effect where one logical requirement (get the data necessary for this page) turns into a high number of physical database queries.
So, rather than focussing on the capacity of your RDBMS, you should focus on the capacity of your programmers and the degree to which your implementation language and environment facilitate profiling and refactoring.
The scenario you propose is clearly a 24x7x365 one, too, so you should also consider the need for monitoring / dashboard requirements.
There's no way to estimate development effort based on the needs you've presented; it's great that you've analyzed your transactions to this level of granularity, but the main determinant of development effort will be the domain and UI requirements.
Choose the technology your developers know and are familiar with. All major technologies out there will handle such requirements with ease.
Your daily update numbers vs commits do not add up. Four commits per second = 14,400 per hour.
You did not mention anything about expected database size.
In any case, I would concentrate my efforts on choosing a robust back end like Oracle, Sybase, MS etc. This choice will make the most difference in performance. The front end could either be a desktop app or WEB app depending on needs. Since this will be used in many offices around the world, a WEB app might make the most sense.
I'd go with MySQL or PostgreSQL. Not likely to have problems with either one for your requirements.
I love object-databases. In terms of commits-per-second and database-roundtrip, no relational database can hold up. Check out db4o. It's dead easy to learn, check out the examples!
As for the programming language and UI framework: Well, take what your team is good at. Dynamic languages with fewer meta-time wasting will probably save time.
There is not enough information provided here to give a proper recommendation. A little more due diligence is in order.
What is the IT culture like? Do they prefer lots of little servers or fewer bigger servers or big iron? What is their position on virtualization?
What is the corporate culture like? What is the political climate like? The open source offerings may very well handle the load but you may need to go with a proprietary vendor just because they are already used to navigating the political winds of a large company. Perception is important.
What is the maturity level of the organization? Do they already have an Enterprise Architecture team in place? Do they even know what EA is?
You've described the operational side but what about the analytical side? What OLAP technology are they expecting to use or already have in place?
Speaking of integration, what other systems will you need to integrate with?

How to gauge the quality of a software product

I have a product, X, which we deliver to a client, C every month, including bugfixes, enhancements, new development etc.) Each month, I am asked to err "guarantee" the quality of the product.
For this we use a number of statistics garnered from the tests that we do, such as:
reopen rate (number of bugs reopened/number of corrected bugs tested)
new bug rate (number of new, including regressions, bugs found during testing/number of corrected bugs tested)
for each new enhancement, the new bug rate (the number of bugs found for this enhancement/number of mandays)
and various other figures.
It is impossible, for reasons we shan't go into, to test everything every time.
So, my question is:
How do I estimate the number and type of bugs that remain in my software?
What testing strategies do I have to follow to make sure that the product is good?
I know this is a bit of an open question, but hey, I also know that there are no simple solutions.
Thanks.
I don't think you can ever really estimate the number of bugs in your app. Unless you use a language and process that allows formal proofs, you can never really be sure. Your time is probably better spent setting up processes to minimize bugs than trying to estimate how many you have.
One of the most important things you can do is have a good QA team and good work item tracking. You may not be able to do full regression testing every time, but if you have a list of the changes you've made to the app since the last release, then your QA people (or person) can focus their testing on the parts of the app that are expected to be affected.
Another thing that would be helpful is unit tests. The more of your codebase you have covered the more confident you can be that changes in one area didn't inadvertently affected another area. I've found this quite useful, as sometimes I'll change something and forget that it would affect another part of the app, and the unit tests showed the problem right away. Passed unit tests won't guarantee that you haven't broken anything, but they can help increase confidence that changes you make are working.
Also, this is a bit redundant and obvious, but make sure you have good bug tracking software. :)
The question is who requires you to provide the stats.
If it's non-technical people, fake the stats. By "fake", I mean "provide any inevitably meaningless, but real numbers" of the kind you mentioned.
If it's technical people without a CS background, they ought to be told about the halting problem, which is undecidable and is simpler than counting and classifying the remaining bugs.
There's a lot of metrics and tools regarding software quality (code coverage, cyclomatic complexity, coding guidelines and tools enforcing them, etc.). In practice, what works is automating as much tests as possible, having human testers do as many tests that weren't automated as possible, and then pray.
I think keeping it simple is the best way to go. Categorize your bugs by severity, and address them in order of decreasing severity.
This way you can hand over the highest-quality build possible (the number of significant bugs remaining is how I would gauge the quality of the product, as opposed to some complex statistics).
Most of the agile methodologies address this dilemma pretty clearly. You can't test everything. Neither can you test it infinite number of times before you release. So the procedure is to rely on the risk and likelihood of the bug. Both risk and likelihood are numerical values. The product of both gives you a RPN number. If the number is less than 15 you ship a beta. If you can bring it down to less than 10 you ship the product and push the bug to be fixed in a future releasee.
How to calculate risk ?
If its a crash then its a 5
If its a crash but you can provide a work around then its a number less than 5.
If the bug reduces the functionality then its a 4
How to calculate likelihood ?
can you re-produce it every time you run, its a 5.
If the work around provided still causes it to crash then less than 5
Well, I am curious to know whether anyone else using this scheme and eager to know their milage on this.
How long is a piece of string? Ultimately what makes a quality product? Bugs gives some indication yes, but many other factors are involved, Unit Test coverage is a key factor in IMO. But in my experience the main factor that effects whether a product can be deemed quality or not, is good understanding of the problem that is being solved. Often what happens is, the 'problem' that the product is meant to solve is not understood correctly and developers end up inventing the solution to a problem they have flesh out in their head, and not the real problem, thus 'bugs' are made. I am a strong proponent of iterative Agile development, that way the product is constantly access against the 'problem' and the product does not stray to far from its goal.
The questions I heard wer, how do I estimate the bugs in my software? and what techniques do I use to ensure the quality is good?
Rather than go through a full course, here are a couple approaches.
How do I estimate the bugs in my software?
Start with the history, you know how many you found during testing (hopefully) and you know how many were found after the fact. You can use that to estimate how efficient you are at finding bugs (DDR - Defect Detection Rate is one name for this). If you can show that for some consistent time period, your DDR is consistent (or improving) you can provide some insight into the quality of the release by guessing at the number of post-release defects that will be found once the product is released.
What techniques do I use to ensure the quality is good?
Root cause analysis on your bugs will point you to specific components that are buggy, specific developers that create buggy code, the fact that lacking full requirements results in implementation not matching expectations, etc.
Project Review meetings to quickly identify what was good, so those things can be repeated and what was bad and find a way to not do those again.
Hopefully, these give you a good start. Good Luck!
It seems the consensus is that the emphasis should be placed on unit testing. Bug tracking is a good indicator of the product quality, but is only is acurate as your test team. If you employ unit testing it gives you a measurable metric of code coverage and provides regression testing so you can be assured you didn't break anything since last month.
My company relies on system/integration level testing. I see alot of defects being introduced because there is a lack of regression testing. I think "bugs" where the developer's implementation of the requirements deviates from the user's vision is sort of a seperate problem that as Dan and rptony stated is best addressed by Agile methodologies.