Minimal set of test cases with modified condition/decision coverage

I have a question regarding modified condition/decision coverage that I can't figure out.
So if I have the expression ((A || B) && C) and the task is to achieve 100% MC/DC with a minimal number of test cases.
I break it down into two parts and find the minimal number of test cases for (A || B) and (X && C), where X stands for (A || B).
(A || B) : {F, F} = F, {F, T} = T, {T, -} = T
(X && C) : {F, -} = F, {T, F} = F, {T, T} = T
The '-' means that it doesn't matter which value they have, since they won't be evaluated at run time because of short-circuit evaluation.
So when I combine these I get this as my minimal set of test cases:
((A || B) && C) : {{F, F}, -} = F, {{F, T}, F} = F, {{T, -}, T} = T
But when I google it this is also in the set: {{F, T}, T} = T
I don't agree with that, because I already tested the parts of this combination separately in the other tests, didn't I?
So I seem to be missing what the fourth test case adds to the set, and it would be great if someone could explain why I must have it.

First of all, a little explanation on MCDC
To achieve MCDC you need to find at least one Test Pair for each condition in your boolean expression that fulfills the MCDC criterion.
At the moment there are 3 types of MCDC defined and approved by certification bodies (e.g. the FAA):
"Unique Cause" MCDC (the original definition): Only one condition, namely the influencing condition, may change between the two test values of the test pair. The resulting decision, i.e. the result of the boolean expression, must also be different for the two test values of the test pair. All other conditions in the test pair must be fixed and unchanged.
"Masking" MCDC: Only one condition of the boolean expression having an influence on the outcome of the decision may change. Other conditions may change as well, but they must be masked. Masked means that, due to the boolean logic in the expression, they have no influence on the outcome: if the left side of an AND is FALSE, the complete right-side expression and its sub-expressions do not matter; they are masked. Masking is a relaxation of "Unique Cause" MCDC.
"Unique Cause + Masking" MCDC: This is a hybrid, needed especially for boolean expressions with strongly coupled conditions, for example "ab+ac". It is not possible to find a test pair for condition "a" that fulfills "Unique Cause" MCDC, so we relax the original definition and allow masking for strongly coupled conditions.
With these 2 additional definitions, many more test pairs can be found.
Please note additionally that, when using languages with short-circuit evaluation of boolean expressions (like Java, C, C++), there are even more test pairs fulfilling the MCDC criteria.
It is very important to understand that a black-box view of a truth table does not allow you to find any kind of masking or short-circuit evaluation.
MCDC is a structural coverage metric, so a white-box view of the boolean expression is absolutely mandatory.
So, now to your expression: "(a+b)c", i.e. ((a || b) && c). A brute-force analysis with short-circuit evaluation gives you the following test pairs for your 3 conditions a, b, c (the test values are numbered 0..7, the rows of the truth table with a as the most significant bit):
Influencing Condition: 'a' Pair: 0, 5 Unique Cause
Influencing Condition: 'a' Pair: 0, 7 Unique Cause
Influencing Condition: 'a' Pair: 1, 5 Unique Cause
Influencing Condition: 'a' Pair: 1, 7 Unique Cause
Influencing Condition: 'b' Pair: 0, 3 Unique Cause
Influencing Condition: 'b' Pair: 1, 3 Unique Cause
Influencing Condition: 'c' Pair: 2, 3 Unique Cause
Influencing Condition: 'c' Pair: 2, 5 Masking
Influencing Condition: 'c' Pair: 2, 7 Masking
Influencing Condition: 'c' Pair: 3, 4 Masking
Influencing Condition: 'c' Pair: 3, 6 Masking
Influencing Condition: 'c' Pair: 4, 5 Unique Cause
Influencing Condition: 'c' Pair: 4, 7 Unique Cause
Influencing Condition: 'c' Pair: 5, 6 Unique Cause
Influencing Condition: 'c' Pair: 6, 7 Unique Cause
Without a white-box view, you will never find "Unique Cause" MCDC test pairs like 0,7 for condition "a". You will also not find any of the valid "Masking" MCDC test pairs.
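As a rough illustration, here is a minimal Python sketch that enumerates such pairs directly from the truth table. It applies only the strict Unique Cause definition (no masking and no short-circuit analysis), so it reports just a subset of the pairs listed above; pairs such as 0,5 for 'a', which rely on short-circuit evaluation, will not show up. The row numbering 0..7 (a as the most significant bit) matches the list above.
from itertools import product

def decision(a, b, c):
    # The decision under test: (a || b) && c
    return (a or b) and c

# Truth-table rows numbered 0..7, with a as the most significant bit,
# matching the numbering of the test values in the pair list above.
rows = list(product([False, True], repeat=3))

def unique_cause_pairs(cond):
    # Pairs of rows that differ only in condition `cond` and flip the decision.
    pairs = []
    for i, j in product(range(8), repeat=2):
        if i >= j:
            continue
        differing = [k for k in range(3) if rows[i][k] != rows[j][k]]
        if differing == [cond] and decision(*rows[i]) != decision(*rows[j]):
            pairs.append((i, j))
    return pairs

for name, cond in (("a", 0), ("b", 1), ("c", 2)):
    print(name, unique_cause_pairs(cond))
# a [(1, 5)]
# b [(1, 3)]
# c [(2, 3), (4, 5), (6, 7)]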
The next step is to find one minimum test set. This can be done by applying the "set cover" or "unate covering" algorithm. From the test pairs above we can create a table that shows which test value covers which condition. In your case this looks like the following:
   0  1  2  3  4  5  6  7
a  X  X           X     X
b  X  X     X
c        X  X  X  X  X  X
Simple elimination of duplicate columns already reduces the table to (eliminated columns shown as '-'):
   0  1  2  3  4  5  6  7
a  X  -           X     -
b  X  -     X
c        -  X  -  X  -  -
which leaves:
   0  3  5
a  X     X
b  X  X
c     X  X
Please note: we employed some heuristics to select which of the completely identical columns should be eliminated, so effectively there are more solutions to the set cover problem. But given the geometric growth of computing time and memory consumption with the number of conditions, this is the only meaningful approach.
Now we will apply Petrick’s method and find out all coverage sets:
1. 0, 3
2. 0, 5
3. 3, 5
That means we need to select, from the test pair list above, test pairs for a, b, c that contain 0 and 3 (for solution 1), 0 and 5 (for solution 2), or 3 and 5 (for solution 3).
Here, too, we apply some heuristics to come up with the minimal solutions:
-------- For Coverage set 1
Test Pair for Condition 'a': 0 5 (Unique Cause)
Test Pair for Condition 'b': 0 3 (Unique Cause)
Test Pair for Condition 'c': 2 3 (Unique Cause)
Resulting Test Vector: 0 2 3 5
-------- For Coverage set 2
Test Pair for Condition 'a': 0 5 (Unique Cause)
Test Pair for Condition 'b': 0 3 (Unique Cause)
Test Pair for Condition 'c': 5 6 (Unique Cause)
Resulting Test Vector: 0 3 5 6
-------- For Coverage set 3
Test Pair for Condition 'a': 1 5 (Unique Cause)
Test Pair for Condition 'b': 1 3 (Unique Cause)
Test Pair for Condition 'c': 5 6 (Unique Cause)
Resulting Test Vector: 1 3 5 6
---------------------------------------
Test vector: Recommended Result: 0 2 3 5
0: a=0 b=0 c=0 (0)
2: a=0 b=1 c=0 (0)
3: a=0 b=1 c=1 (1)
5: a=1 b=0 c=1 (1)
You can see that each of the 3 possible solutions consists of 4 test cases that cover all conditions and fulfill the MCDC criteria. If you check publications on the internet, you will see that the minimum number of tests is n+1 (n = number of conditions). That may not sound like a great achievement, but MCC (Multiple Condition Coverage) needs 2^n test cases, i.e. 256 test cases for 8 conditions. With MCDC there are only 9. That was one reason to come up with MCDC.
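To see the n+1 bound emerge by brute force, the following small Python sketch searches for the smallest set of truth-table rows that contains a flipping pair for every condition. It again uses only the strict Unique Cause definition, so the minimal sets it reports differ from the short-circuit-aware recommendation above, but they also have size 4 = n+1.
from itertools import combinations, product

def decision(a, b, c):
    return (a or b) and c

rows = list(product([False, True], repeat=3))   # rows 0..7, a is the most significant bit

def covers(subset):
    # True if `subset` contains, for every condition, a pair of rows that
    # differ only in that condition and produce different decisions.
    for cond in range(3):
        found = any(
            [k for k in range(3) if rows[i][k] != rows[j][k]] == [cond]
            and decision(*rows[i]) != decision(*rows[j])
            for i, j in combinations(subset, 2)
        )
        if not found:
            return False
    return True

for size in range(1, 9):
    minimal = [s for s in combinations(range(8), size) if covers(s)]
    if minimal:
        print(size, minimal)   # 4 [(1, 2, 3, 5), (1, 3, 4, 5)]
        break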
I also created a piece of software to help better understand MCDC coverage. It takes a boolean expression as input, simplifies it (if you wish) and calculates all MCDC test pairs.
You may find it here:
MCDC
I hope I could shed some light on the topic. I am happy to answer further questions.

Recall that for MC/DC you need for every condition P (A/B/C in your case) two test cases T and T' so that P is true in T and false in T', and so that the outcome of the predicate is true in one and false in the other test case.
An MC/DC cover for ((A || B) && C) is:
T1: (F, F, T) -> F (your first test case)
T2: (F, T, T) -> T (B flips outcome compared to T1, missing)
T3: (T, F, T) -> T (A flips outcome compared to T1, your third test case)
T4: (F, T, F) -> F (C flips outcome compared to T2, your second test case)
In concrete test cases you cannot have "-" / don't-care values: You must make a choice when you exercise the system.
So what you are missing in your answer is a pair of two test cases (T1 and T2) in which flipping only the second condition B also flips the outcome.
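A tiny Python check of exactly this criterion, applied to the four test cases T1..T4 above (written here as (A, B, C) tuples), confirms that each condition has such a pair within the set:
from itertools import combinations

def outcome(a, b, c):
    return (a or b) and c

# T1..T4 from above, as (A, B, C) tuples
tests = [(False, False, True),    # T1 -> F
         (False, True,  True),    # T2 -> T
         (True,  False, True),    # T3 -> T
         (False, True,  False)]   # T4 -> F

for cond, name in enumerate("ABC"):
    shown = any(
        [k for k in range(3) if t1[k] != t2[k]] == [cond]
        and outcome(*t1) != outcome(*t2)
        for t1, t2 in combinations(tests, 2)
    )
    print(name, "independence shown:", shown)   # True for A, B and C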

To achieve MCDC, the output must change when only one input changes while all other inputs remain constant. For example, in your case:
MCDC for (A || B) && C
When A changes from 1 to 0 while B and C remain constant, the output changes from 1 to 0. The same applies to both B and C.
Normally the minimum number of test cases to achieve MCDC is stated as the number of inputs + the number of outputs.
So in your case MCDC can be achieved with a minimum of 4 test cases.

Related

How can I improve InsertionSort by the following argument? The correct answer is b. Can someone CLEARLY explain every answer?

A person claims that they can improve InsertionSort by the following argument. In the innermost loop of InsertionSort, instead of looping over all entries in the already sorted array in order to insert the j’th observed element, simply perform BinarySearch in order to sandwich the j’th element in its correct position in the list A[1, ... , j−1]. This person claims that their resulting insertion sort is asymptotically as good as mergesort in the worst case scenario. True or False and why? Circle the one correct answer from the below:
a. True: In this version, the while loop will iterate log(n), but in each such iteration elements in the left side of the list have to be shifted to allow room for the key to propagate downwards across the median elements and so this shift will still require log(n) in the worst case scenario. Adding up, Insertion Sort will significantly improve in this case to continue to require n log(n) in the worst case scenario like mergesort.
b. False: In this version, the while loop will iterate log(n), but in each such iteration elements in the left side of the list have to be shifted to allow room for the key to propagate downwards and so this shift will still require n in the worst case scenario. Adding up, Insertion Sort will continue to require n² in the worst case scenario which is orders of magnitude worse than mergesort.
c. False: In this version, the while loop will iterate n, but in each such iteration elements in the left side of the list have to be shifted to allow room for the key to propagate downwards and so this shift will still require log(n) in the worst case scenario. Adding up, Insertion Sort will continue to require n log(n) in the worst case scenario which is orders of magnitude worse than mergesort.
d. True: In this version, the while loop will iterate log(n), but in each such iteration elements in the left side of the list have to be shifted to allow room for the key to propagate downwards and so this shift will still require n in the worst case scenario. Adding up, Insertion Sort will continue to require n log(n) in the worst case scenario which is orders of magnitude worse than mergesort.
b is correct, with some assumptions about compiler optimizations.
Consider a reverse sorted array,
8 7 6 5 4 3 2 1
and that insertion sort is half done so it is
5 6 7 8 4 3 2 1
The next step:
normal insertion sort sequence assuming most recent value read kept in register:
t = a[4] = 4 1 read
compare t and a[3] 1 read
a[4] = a[3] = 8 1 write
compare t and a[2] 1 read
a[3] = a[2] = 7 1 write
compare t and a[1] 1 read
a[2] = a[1] = 6 1 write
compare t and a[0] 1 read
a[1] = a[0] = 5 1 write
a[0] = t = 4 1 write
---------------
5 read 5 write
binary search
t = a[4] 1 read
compare t and a[1] 1 read
compare t and a[0] 1 read
a[4] = a[3] 1 read 1 write
a[3] = a[2] 1 read 1 write
a[2] = a[1] 1 read 1 write
a[1] = a[0] 1 read 1 write
a[0] = t 1 write
----------------
7 read 5 write
If the compiler re-read the data in the normal insertion sort, it would be
9 read 5 write
In which case the binary search would save some time.
The expected answer to this question is b), but the explanation is not precise enough:
locating the position where to insert the j-th element indeed requires log(j) comparisons instead of j comparisons for regular Insertion Sort.
inserting the elements requires j element moves in the worst case for both implementations (reverse sorted array).
Summing these over the whole array produces:
n log(n) comparisons for this modified Insertion Sort idea in all cases, vs. n² comparisons in the worst case (reverse sorted array) for the classic implementation.
n² element moves in the worst case in both implementations (reverse sorted array).
note that in the classic implementation the sum of the number of comparisons and element moves is constant.
Merge Sort on the other hand uses approximately n log(n) comparisons and n log(n) element moves in all cases.
Therefore the claim that the resulting insertion sort is asymptotically as good as mergesort in the worst case scenario is False, indeed because the modified Insertion Sort method still performs n² element moves in the worst case, which is asymptotically much worse than n log(n) moves.
Note however that depending on the relative cost of comparisons and element moves, the performance of this modified Insertion Sort approach may be much better than the classic implementation, for example sorting an array of string pointers containing URLs to the same site, the cost of comparing strings with a long initial substring is much greater than moving a single pointer.
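To make the comparisons-versus-moves argument concrete, here is a small Python sketch of the binary-search variant, instrumented with rough operation counters (the comparison count is an approximation of what the standard bisect_left performs). On a reverse-sorted input the comparison count grows like n log(n) while the element moves still grow like n²/2:
from bisect import bisect_left

def binary_insertion_sort(a):
    # Insertion sort that locates the slot with binary search
    # but still shifts the tail of the sorted prefix element by element.
    comparisons = moves = 0
    for j in range(1, len(a)):
        key = a[j]
        pos = bisect_left(a, key, 0, j)   # ~log2(j) comparisons to find the slot
        comparisons += j.bit_length()     # rough count of those comparisons
        a[pos + 1:j + 1] = a[pos:j]       # the shift is still O(j) element moves
        moves += j - pos
        a[pos] = key
    return comparisons, moves

n = 1024
print(binary_insertion_sort(list(range(n, 0, -1))))   # (9217, 523776)
Replacing bisect_left with a linear backward scan would bring the comparison count back up to roughly the move count, which is the classic behaviour described above.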

Minimum of empty Seq is infinite, Why?

I'm working on this week's PerlWChallenge.
You are given an array of integers @A. Write a script to create an
array that represents the smaller element to the left of each
corresponding index. If none found then use 0.
Here's my approach:
my @A = (7, 8, 3, 12, 10);
my $L = @A.elems - 1;
say gather for 1 .. $L -> $i { take @A[ 0..$i-1 ].grep( * < @A[$i] ).min };
Which kinda works and outputs:
(7 Inf 3 3)
The Infinity obviously comes from the empty grep. Checking:
> raku -e "().min.say"
Inf
But why is the minimum of an empty Seq Infinity? If anything it should be -Infinity. Or zero?
It's probably a good idea to test for the empty sequence anyway.
I ended up using
take .min with @A[ 0..$i-1 ].grep( * < @A[$i] ) or 0
or
take ( @A[ 0..$i-1 ].grep( * < @A[$i] ) or 0 ).min
Generally, Inf works out quite well in the face of further operations. For example, consider a case where we have a list of lists, and we want to find the minimum across all of them. We can do this:
my @a = [3,1,3], [], [-5,10];
say @a>>.min.min
And it will just work, since (1, Inf, -5).min comes out as -5. Were min to instead have -Inf as its value, then it'd get this wrong. It will also behave reasonably in comparisons, e.g. if @a.min > @b.min { }; by contrast, an undefined value will warn.
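The same pattern can be mimicked in Python, for comparison: if the empty sublists contribute a +infinity default, the outer minimum still comes out right:
import math

a = [[3, 1, 3], [], [-5, 10]]
# The empty sublist contributes +inf, which never wins the outer min.
print(min(min(xs, default=math.inf) for xs in a))   # -5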
TL;DR say min displays Inf.
min is, or at least behaves like, a reduction.
Per the doc for reduction of a List:
When the list contains no elements, an exception is thrown, unless &with is an operator with a known identity value (e.g., the identity value of infix:<+> is 0).
Per the doc for min:
a comparison Callable can be specified with the named argument :by
by is min's spelling of with.
To easily see the "identity value" of an operator/function, call it without any arguments:
say min # Inf
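The identity-value idea is not Raku-specific; a Python reduction behaves the same way if +infinity is used as the initial value:
import functools, math

print(functools.reduce(min, [], math.inf))          # inf
print(functools.reduce(min, [3, 1, 2], math.inf))   # 1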
Imo the underlying issue here is one of many unsolved, broad challenges of documenting Raku. Perhaps comments here on this SO question about the doc would best focus on the narrow topic of solving the problem just for min (and maybe max and minmax).
I think the inspiration comes from the infimum (the greatest lower bound). Suppose we have the set of integers (or real numbers) and add to it a greatest element Inf and a least element -Inf. Then the infimum of the empty set (as a subset of that set) is the greatest element Inf. (Every element is vacuously a lower bound of the empty set, and Inf is the greatest element that satisfies this.) The minimum and the infimum of any nonempty finite set of real numbers are equal.
Similarly, min in Raku works as infimum for some Range.
1 ^.. 10
andthen .min; #1
but 1 is not in 1 ^.. 10, so 1 is not the minimum; it is the infimum of the range.
This is useful for some algorithms; see the answer by Jonathan Worthington, or:
q{3 1 3
-2
--
-5 10
}.lines
andthen .map: *.comb( /'-'?\d+/ )».Int # (3, 1, 3), (-2,), (), (-5, 10)
andthen .map: *.min # 1,-2,Inf,-5
andthen .produce: &[min]
andthen .fmt: '%2d',',' # 1,-2,-2,-5
this (from the docs) makes sense to me
method min(Range:D:)
Returns the start point of the range.
say (1..5).min; # OUTPUT: «1␤»
say (1^..^5).min; # OUTPUT: «1␤»
and I think the infimum idea is quite a good mnemonic for the excludes case, which also could be 5.1^.., 5.0001^.., etc.

Set partitioning

I'm trying to get a good grasp of this problem but I'm struggling.
Let's say that I have a set S={1,2,3,4,5}, a list of subsets L={(1,3,4),(2,3),(4,5),(1,3),(2),(5)} and another tuple with the costs of the subsets in L, like C={10,20,12,15,4,10}.
I want to write a constraint program in Prolog that finds the solution with the minimum cost (in this case, the total sum of the costs of the subsets I pick).
My problem is that I cannot figure out how to model it. What I do know is that I should use a model with binary {0,1} variables, but I hardly understand how to express it in Prolog.
There is an easy way to do it: You can use Boolean indicators to denote which elements comprise a subset. For example, in your case:
subsets(Sets) :-
        Sets = [[1,0,1,1,0]-10,   % {1,3,4}
                [0,1,1,0,0]-20,   % {2,3}
                [0,0,0,1,1]-12,   % {4,5}
                [1,0,1,0,0]-15,   % {1,3}
                [0,1,0,0,0]-4,    % {2}
                [0,0,0,0,1]-10].  % {5}
I now use SICStus Prolog and its Boolean constraint solver to express set covers:
:- use_module(library(lists)).
:- use_module(library(clpb)).
setcover(Cover, Cost) :-
        subsets(Sets),
        keys_and_values(Sets, Rows, Costs0),
        transpose(Rows, Cols),
        same_length(Rows, Coeffs),
        maplist(cover(Coeffs), Cols),
        labeling(Coeffs),
        phrase(coeff_is_1(Coeffs, Rows), Cover),
        phrase(coeff_is_1(Coeffs, Costs0), Costs),
        sumlist(Costs, Cost).

cover(Coeffs, Col) :-
        phrase(coeff_is_1(Col,Coeffs), Cs),
        sat(card([1],Cs)).

coeff_is_1([], []) --> [].
coeff_is_1([1|Cs], [L|Ls]) --> [L], coeff_is_1(Cs, Ls).
coeff_is_1([0|Cs], [_|Ls]) --> coeff_is_1(Cs, Ls).
For each subset, a Boolean variable is used to denote whether that subset is part of the cover. Cardinality constraints make sure that each element is covered exactly once.
Example query and its result:
| ?- setcover(Cover, Cost).
Cover = [[0,0,0,1,1],[1,0,1,0,0],[0,1,0,0,0]],
Cost = 31 ? ;
Cover = [[1,0,1,1,0],[0,1,0,0,0],[0,0,0,0,1]],
Cost = 24 ? ;
no
I leave picking a cover with minimum cost as an easy exercise.
Maybe an explicit model for your problem instance makes things a bit clearer:
cover(SetsUsed, Cost) :-
        SetsUsed = [A,B,C,D,E,F],   % a Boolean for each set
        SetsUsed #:: 0..1,
        A + D #= 1,                 % use one set with element 1
        B + E #= 1,                 % use one set with element 2
        A + B + D #= 1,             % use one set with element 3
        A + C #= 1,                 % use one set with element 4
        C + F #= 1,                 % use one set with element 5
        Cost #= 10*A + 20*B + 12*C + 15*D + 4*E + 10*F.
You can solve this e.g. in ECLiPSe:
?- cover(SetsUsed,Cost), branch_and_bound:minimize(labeling(SetsUsed), Cost).
SetsUsed = [1, 0, 0, 0, 1, 1]
Cost = 24
Yes (0.00s cpu)
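If the Prolog machinery gets in the way of understanding, the same 0/1 model for this particular instance can be brute-forced in a few lines of Python, enumerating all 2^6 indicator assignments and keeping the cheapest exact cover (a sketch for this tiny example only, not a general solver):
from itertools import product

subsets = [({1, 3, 4}, 10), ({2, 3}, 20), ({4, 5}, 12),
           ({1, 3}, 15), ({2}, 4), ({5}, 10)]
universe = {1, 2, 3, 4, 5}

best = None
for used in product([0, 1], repeat=len(subsets)):       # one Boolean per subset
    # exact cover: every element of S must occur in exactly one chosen subset
    counts = {e: sum(u for u, (s, _) in zip(used, subsets) if e in s) for e in universe}
    if all(c == 1 for c in counts.values()):
        cost = sum(u * c for u, (_, c) in zip(used, subsets))
        if best is None or cost < best[0]:
            best = (cost, used)

print(best)   # (24, (1, 0, 0, 0, 1, 1))  i.e. {1,3,4}, {2}, {5}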

Approaches to converting a table of possibilities into logical statements

I'm not sure how to express this problem, so my apologies if it's already been addressed.
I have business rules summarized as a table of outputs given two inputs. For each of five possible values on one axis, and each of five values on the other axis, there is a single output. There are ten distinct possibilities in these 25 cells, so it's not the case that each input pair has a unique output.
I have encoded these rules in TSQL with nested CASE statements, but it's hard to debug and modify. In C# I might use an array literal. I'm wondering if there's an academic topic which relates to converting logical rules to matrices and vice versa.
As an example, one could translate this trivial matrix:
    A  B  C
   -- -- --
X   1  1  0
Y   0  1  0
...into rules like so:
if B OR (A and X) then 1 else 0
...or, in verbose SQL:
CASE WHEN FieldABC = 'B' THEN 1
     WHEN FieldABC = 'A' AND FieldXY = 'X' THEN 1
     ELSE 0
END
I'm looking for a good approach for larger matrices, especially one I can use in SQL (MS SQL 2K8, if it matters). Any suggestions? Is there a term for this type of translation, with which I should search?
Sounds like a lookup into a 5x5 grid of data, with the inputs on the axes and the output in each cell:
      Y=1  Y=2  Y=3  Y=4  Y=5
x=1    A    A    D    B    A
x=2    B    A    A    B    B
x=3    C    B    B    B    B
x=4    C    C    C    D    D
x=5    C    C    C    C    C
You can store this in a table of x,y,outvalue triplets and then just do a look up on that table.
SELECT OUTVALUE FROM BUSINESS_RULES WHERE X = @X and Y = @Y;
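The same "rules as data" idea, shown outside SQL for comparison: in Python the whole grid is just a keyed lookup, which is the structure the BUSINESS_RULES table mimics, instead of nested conditionals:
# Row-major encoding of the 5x5 grid above: RULES[(x, y)] -> output
RULES = {
    (1, 1): 'A', (1, 2): 'A', (1, 3): 'D', (1, 4): 'B', (1, 5): 'A',
    (2, 1): 'B', (2, 2): 'A', (2, 3): 'A', (2, 4): 'B', (2, 5): 'B',
    (3, 1): 'C', (3, 2): 'B', (3, 3): 'B', (3, 4): 'B', (3, 5): 'B',
    (4, 1): 'C', (4, 2): 'C', (4, 3): 'C', (4, 4): 'D', (4, 5): 'D',
    (5, 1): 'C', (5, 2): 'C', (5, 3): 'C', (5, 4): 'C', (5, 5): 'C',
}

print(RULES[(1, 3)])   # D
print(RULES[(4, 5)])   # D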

Treatment of error values in the SQL standard

I have a question about the SQL standard which I'm hoping a SQL language lawyer can help with.
Certain expressions just don't work. 62 / 0, for example. The SQL standard specifies quite a few ways in which expressions can go wrong in similar ways. Lots of languages deal with these expressions using special exceptional flow control, or bottom pseudo-values.
I have a table, t, with (only) two columns, x and y each of type int. I suspect it isn't relevant, but for definiteness let's say that (x,y) is the primary key of t. This table contains (only) the following values:
x y
7 2
3 0
4 1
26 5
31 0
9 3
What behavior is required by the SQL standard for SELECT expressions operating on this table which may involve division(s) by zero? Alternatively, if no one behavior is required, what behaviors are permitted?
For example, what behavior is required for the following select statements?
The easy one:
SELECT x, y, x / y AS quot
FROM t
A harder one:
SELECT x, y, x / y AS quot
FROM t
WHERE y != 0
An even harder one:
SELECT x, y, x / y AS quot
FROM t
WHERE x % 2 = 0
Would an implementation (say, one that failed to realize on a more complex version of this query that the restriction could be moved inside the extension) be permitted to produce a division by zero error in response to this query, because, say, it attempted to divide 3 by 0 as part of the extension before performing the restriction and realizing that 3 % 2 = 1? This could become important if, for example, the extension was over a small table but the result--when joined with a large table and restricted on the basis of data in the large table--ended up restricting away all of the rows which would have required division by zero.
If t had millions of rows, and this last query were performed by a table scan, would an implementation be permitted to return the first several million results before discovering a division by zero near the end when encountering one even value of x with a zero value of y? Would it be required to buffer?
There are even worse cases, ponder this one, which depending on the semantics can ruin boolean short-circuiting or require four-valued boolean logic in restrictions:
SELECT x, y
FROM t
WHERE ((x / y) >= 2) AND ((x % 2) = 0)
If the table is large, this short-circuiting problem can get really crazy. Imagine the table had a million rows, one of which had a 0 divisor. What would the standard say is the semantics of:
SELECT CASE
WHEN EXISTS
(
SELECT x, y, x / y AS quot
FROM t
)
THEN 1
ELSE 0
END AS what_is_my_value
It seems like this value should probably be an error since it depends on the emptiness or non-emptiness of a result which is an error, but adopting those semantics would seem to prohibit the optimizer from short-circuiting the table scan here. Does this existence query require proving the existence of one non-bottoming row, or also the non-existence of a bottoming row?
I'd appreciate guidance here, because I can't seem to find the relevant part(s) of the specification.
All implementations of SQL that I've worked with treat a division by 0 as an immediate NaN or #INF. The division is supposed to be handled by the front end, not by the implementation itself. The query should not bottom out, but the result set needs to return NaN in this case. Therefore, it's returned at the same time as the result set, and no special warning or message is brought up to the user.
At any rate, to properly deal with this, use the following query:
select
    x, y,
    case y
        when 0 then null
        else x / y
    end as quot
from
    t
To answer your last question, this statement:
SELECT x, y, x / y AS quot
FROM t
Would return this:
x y quot
7 2 3.5
3 0 NaN
4 1 4
26 5 5.2
31 0 NaN
9 3 3
So, your exists would find all the rows in t, regardless of what their quotient was.
Additionally, I was reading over your question again and realized I hadn't discussed where clauses (for shame!). The where clause, or predicate, should always be applied before the columns are calculated.
Think about this query:
select x, y, x/y as quot from t where x%2 = 0
If we had a record (3,0), it applies the where condition, and checks if 3 % 2 = 0. It does not, so it doesn't include that record in the column calculations, and leaves it right where it is.