gcov/lcov + googletest create an artificially low branch coverage report

First, I am well aware of the "hidden branch" problem caused by throws/exceptions. This is not that.
What I am observing is:
My test framework (googletest) has testing macros (EXPECT_TRUE, for example).
I write passing tests using those macros.
Measured branch coverage now asymptotes at 50%, because I have not evaluated each assertion in both a passing and a failing condition...
Consider the following:
TEST(MyTests, ContrivedTest)
{
    EXPECT_TRUE(function_that_always_returns_true());
}
Now, assuming that I have every line and every branch perfectly covered in function_that_always_returns_true(), this branch coverage report will asymptote at 50%, because gcov never observes line 3 (the EXPECT_TRUE) evaluating in a failing condition, and intentionally so.
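The root cause is that each assertion macro carries its own conditional. Very loosely (this is a paraphrase for illustration, not the literal googletest expansion, which goes through internal GTEST_* helper macros), a passing EXPECT_TRUE behaves like the sketch below, so the failure arm is a branch that gcov instruments but that a green test never takes:

#include <gtest/gtest.h>

// Hypothetical stand-in for the function under test in the question.
static bool function_that_always_returns_true() { return true; }

TEST(MyTests, ContrivedTestParaphrased)
{
    // Rough paraphrase of what EXPECT_TRUE(expr) reduces to after preprocessing:
    if (function_that_always_returns_true()) {
        // Passing path: the only one a green test ever executes.
    } else {
        // Failing path: a branch gcov still counts, and reports as never taken.
        ADD_FAILURE() << "function_that_always_returns_true() returned false";
    }
}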
The only idea that I've had around this issue is that I could exclude the evaluation macros with something like LCOV_EXCL_BR_LINE, but this feels both un-ergonomic and hacky.
TEST(MyTests, ContrivedTest)
{
    bool my_value = function_that_always_returns_true();
    EXPECT_TRUE(my_value); // LCOV_EXCL_BR_LINE
}
This cannot be a niche problem, and I have to believe that people successfully use googletest with lcov/gcov. What do people do to get around this limitation?

After looking for far too long, I realized that all the testing calls I want to filter out are of the pattern EXPECT_*. So simply adding:
lcov_excl_br_line=LCOV_EXCL_BR_LINE|EXPECT_*
to my lcovrc solved my problem.
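For reference, this is roughly where that setting lives (a minimal lcovrc sketch; the exact file location and the lcov version that supports this option are assumptions on my part):

# ~/.lcovrc, or a file passed to lcov via --config-file (sketch).
# The value is treated as a regular expression, so EXPECT_* matches the substring
# "EXPECT" (with optional underscores), which is enough to catch the assertion lines.
lcov_excl_br_line = LCOV_EXCL_BR_LINE|EXPECT_*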


Why is ScoreManager changing the solution to a worse score?

I'm using solverManager to continually save the best score:
solverManager.solveAndListen(
    SINGLETON_TIME_TABLE_ID,
    this::findById,
    this::save
)
My save() method just updates a global reference to the best solution and nothing else happens to it.
However, when I go to retrieve the best solution and get the score details, it does not always get the best solution:
val solution: MySolution = findById(SINGLETON_TIME_TABLE_ID)
println(solution.score) // prints 1100
scoreManager.updateScore(solution) // Sets the score
println(solution.score) // prints 1020 (it's now worse than before)
Simply calling scoreManager.updateScore(solution) seems to bring back a worse solution. In the above example, the score might go from 1100 down to 1020. Is there an obvious explanation for this?
I'm using SimpleScore and this happens well after all planning variables are assigned.
I'm using a variable listener ArrivalTimeUpdatingVariableListener : VariableListener.
Not sure what else is relevant. Thanks.
This has all the warning signs of score corruption. Please run your solver for a couple of minutes with <environmentMode>FULL_ASSERT</environmentMode>. If you see exceptions being thrown, you know that your constraints have a bug in them. What that bug is, I cannot tell without seeing those constraints.
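For reference, FULL_ASSERT is set in the solver configuration. A minimal solverConfig.xml sketch (only environmentMode is the point here; the class names are placeholders for your own domain):

<solver xmlns="https://www.optaplanner.org/xsd/solver">
  <environmentMode>FULL_ASSERT</environmentMode>
  <solutionClass>org.example.MySolution</solutionClass>
  <entityClass>org.example.MyEntity</entityClass>
</solver>

FULL_ASSERT is slow, which is why running it for just a couple of minutes is usually enough to surface a corrupted score.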

Factorial code coverage

In testing a recursive function such as the following factorial method, is it necessary to test the base case? If I pass in 3, the output is 7 and code coverage reports show 100%. However, I didn't explicitly test factorial(0). What are your thoughts on this?
public class Factorial
{
    public static double factorial(int x)
    {
        if (x == 0)
        {
            return 1.0;
        }
        return x + factorial(x - 1);
    }
}
Code coverage doesn't tell you everything. In this case you'll get 100% line coverage for factorial(3) but it won't cover all "cases".
When testing a recursive function you'd want to test various cases:
Each of the base cases.
The recursive cases.
Incorrect input (e.g., negative numbers).
Any function-specific edge cases.
You can test less, but you'll leave yourself open to bugs when the code is changed in the future.
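To make those cases concrete, here is a rough sketch of what such tests could look like. I've used googletest (the framework from the first question) against a hypothetical C++ port of the method; note that the port assumes the intended multiplication, whereas the Java version above adds:

#include <gtest/gtest.h>

// Hypothetical C++ port of the method under test (assumption: the intended
// operation is multiplication, whereas the Java version above uses '+').
static double factorial(int x)
{
    if (x == 0)
    {
        return 1.0;
    }
    return x * factorial(x - 1);
}

// Base case: factorial(0) must return 1.
TEST(FactorialTests, BaseCase)
{
    EXPECT_DOUBLE_EQ(1.0, factorial(0));
}

// Recursive case: exercises the recursion down to the base case.
TEST(FactorialTests, RecursiveCase)
{
    EXPECT_DOUBLE_EQ(6.0, factorial(3));
}

// Incorrect input: as written, a negative argument never reaches the base case,
// so the behavior for negatives has to be defined, and then tested, explicitly.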
Technically, if you test [well, execute] factorial(2) you also test [execute] factorial(1), assuming your algorithm is correct.
How would your testing tool know that your code was correct? How would a tester who didn't write the code know your code was correct? The point of "test coverage" is to determine how much of the code has actually been tested [well, executed], regardless of whether it is algorithmically correct or not.
A line not executed is a line for which you have no evidence that it works. That is what test coverage tells you. The purpose of your tests is to check that the application computes the correct answer. Coverage and tests serve two different purposes; coverage just happens to take advantage of the fact that tests exercise code.

Data Flow Coverage

If you write a program, it is usually possible to drive it such that all paths are covered. Hence, 100% coverage is easy to obtain (ignoring infeasible code paths, which modern compilers catch anyway).
However, 100% code coverage should imply that all variable definition-use coverage is also achieved, because variables are defined within the program and used within it. If all code is covered, all DU pairs should also be covered.
Why, then, is it said that path coverage is easier to obtain, while 100% data flow coverage is not usually achievable? I do not understand why not. What would be an example of that?
It's easier to achieve 100% code coverage than to cover all possible inputs, because the set of all possible inputs can be extremely large or practically unlimited. It would take too much time to test them all.
Let's look at a simple example function:
double invert(double x) {
    return 1.0 / x;
}
A unit test could look like this:
double y = invert(5);
double expected = 1.0/5.0;
EXPECT_EQ( expected, y );
This test achieves 100% code coverage. However, it's only 1 in 1.8446744e+19 possible inputs (assuming a double is 64 bits wide).
The idea behind All-pairs Testing is that it's not practical to test every possible input, so we have to identify the ranges that would cover all cases.
With my invert() function, there are at least two sets that matter: {non-zero values} and {zero}.
We need to add another test, which covers the same code path, but has a different outcome:
EXPECT_ANY_THROW(invert(0.0));
Furthermore, since the test writer has to design the different sets of parameters that achieve full data-input coverage for a test, it can be impossible to know what the correct sets are from the outside.
Consider this function:
double multiply(double x, double y);
My instinct would be to write tests for small numbers and another for big numbers, to test overflow.
However, the developer may have written it poorly, in this way:
double multiply(double x, double y) {
    if (x == 0) return 0;
    return 1.0 / ((1.0 / x) * (1.0 / y));
}
If our tests didn't use 0 for y, then we'd miss a bug. Knowledge of how the algorithms are designed is very important in understanding the proper inputs for a unit test, and that's why the programmers who write the code need to be involved in unit testing.

In any program, doesn't 100% statement coverage imply 100% branch coverage?

While solving MCQs for a practice test, I came across this statement: "In any program, 100% statement coverage implies 100% branch coverage", and it is marked as incorrect. I think it's a correct statement, because if we cover all the statements, then it means we also cover all the paths and hence all the branches. Could someone please shed more light on this one?
Consider this code:
...
if (SomeCondition) DoSomething();
...
If SomeCondition is always true, you can have 100% statement coverage (SomeCondition and DoSomething() will be covered), but you never exercise the case when the condition is false, in which DoSomething() is skipped.
In the example below, a = true will cover 100% of the statements, but fails to test the branch where a division-by-zero fault is possible.
int fun(bool a) {
    int x = 0;
    if (a) x = 1;
    return 100 / x;
}
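To tie this to a concrete measurement: a single test like the sketch below (a googletest sketch, reusing fun() from above) executes every statement of fun(), so statement coverage reads 100%, yet the false direction of the if, the one that leaves x at 0 and divides by zero, is never taken, so branch coverage stays incomplete.

#include <gtest/gtest.h>

int fun(bool a) {
    int x = 0;
    if (a) x = 1;
    return 100 / x;
}

// Executes every statement in fun(), but only the 'a == true' direction of the branch.
TEST(CoverageDemo, TrueBranchOnly)
{
    EXPECT_EQ(100, fun(true));
}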
For a test set to achieve 100% branch coverage, every branching point in the code must have been taken in each direction, at least once.
The archetypical example, showing that 100% statement coverage does not imply 100% branch coverage, was already given by Alexey Frunze. It is a consequence of the fact that (at least in the majority of programming languages) it is possible to have branches that do not involve statements (such a branch basically skips the statements in the other branch).
The reason for wanting 100% branch coverage, rather than just 100% statement coverage, is that your tests must also show that skipping some statements works as expected.
My main reason for providing this answer is to point out that the converse, viz. "100% branch coverage implies 100% statement coverage" is correct.
Just because you cover every statement doesn't mean that you covered every branch the program could have taken.
You have to look at every possible branch, not just the statements inside every branch.

Creating robust real-time monitors for variables

We can create a real-time monitor for a variable like this:
CreatePalette@Panel@Row[{"x = ", Dynamic[x]}]
(This is more interesting and useful if x happens to be something like $Assumptions. It's so easy to set a value and then forget about it.)
Unfortunately this stops working if the kernel is re-launched (Quit[], then evaluate something). The palette won't show changes in the value of x any more.
Is there a way to do this so it keeps working even across kernel sessions? I find myself restarting the kernel quite often. (If the resulting palette causes the kernel to be automatically started after Quit that's fine.)
Update: As mentioned in the comments, it turns out that the palette ceases working only if we quit by evaluating Quit[]. When using Evaluation -> Quit Kernel -> Local, it will keep working.
Link to same question on MathGroup.
I can only guess, because on my Ubuntu machine the situation seems buggy. The trick with Quit from the menu that Leonid suggested did not work here. Another observation: in a fresh Mathematica session with only one notebook open,
Dynamic[x]
x = 1
Dynamic[x]
x = 2
gives as expected
2
1
2
2
Typing Quit on the next line, evaluating it, and then entering x = 3 updates only the first of the Dynamic[x] outputs.
Nevertheless, have you checked the command
Internal`GetTrackedSymbols[]
This gives not only the tracked symbols but also some kind of ID indicating where the dynamic content belongs. If you can find out what exactly these numbers are, and investigate the other functions you find in the Internal` context, you may be able to re-register your palette's Dynamic content manually after restarting the kernel.
I thought I had something like that with
Internal`SetValueTrackExtra
but I'm currently not able to reproduce the behavior.
@halirutan's answer jarred my memory...
Have you ever come across Experimental/ref/ValueFunction (its documentation address)?
Although the documentation contains no examples, the 'more information' section provides the following tidbit:
The assignment ValueFunction[symb] = f specifies that whenever symb gets a new value val, the expression f[symb, val] should be evaluated.