Basic vs. compound condition coverage - testing

I'm trying to get my head around the differences between these 2 coverage criteria and I can't work out how they differ. I think I'm failing to understand exactly what decision coverage is. My software testing textbook states that compound decision coverage can be costly (2n combinations for n basic conditions).
I would have thought basic condition coverage would be costlier.
Consider a && b && c && d && e. My understanding is that in basic condition coverage, each of these atomic variables have to have the value TRUE and FALSE in a test case for the test case to be have basic condition adequacy - that's 32 different test cases.
So what is the actual difference, and what is referred to as a "basic condition". In the example above, is a a basic condition?
Thanks.

Regarding terminology, I don't have a single source handy that uses the exact terms "basic condition coverage" and "multiple condition coverage". Binder's "Testing Object-Oriented Systems" says "condition coverage" and "multiple-condition coverage". Everett & McLeod's "Software Testing" says "simple condition coverage" and "compound condition coverage". But I'm certain that the first term in each case is your "basic condition coverage" and the second is your "compound condition coverage". I'll use those terms below.
Basic condition coverage means that every basic condition in the program is true in some test and false in some test, regardless of other conditions. In the following
if a && b && c
# do stuff
else
# do other stuff
end
there is a compound condition, a && b && c, with three basic conditions, a, b and c. It takes only two test cases, one where all basic conditions are true and one where all are false, to get full basic condition coverage. It doesn't matter that the basic conditions happen to be part of a compound condition.
Note that basic condition coverage is not branch coverage. If the compound condition were a && b && !c, those two test cases above would still achieve basic condition coverage but would not achieve branch coverage.
A less aggressively optimized set of test cases for basic condition coverage would have one test case where all three basic conditions are false and three test cases with a different basic condition true in each. That would still only be four of the eight possible combinations of basic conditions in the compound condition. The uncomfortable feeling that we're ignoring the other four is why there's compound condition coverage. That requires a test for each possible combination of basic conditions in a compound condition. In the example above, you'd need eight tests, one for each possible combination of possible values of a, b and c, to get full compound condition coverage.

First, the difference between Decision and Condition.
A Condition is an atomar boolean expression that can not be broken down into simpler boolean expression. For example: a (if a is boolean).
A Decision is a compound of Conditions with zero or more Boolean operators. A Decision without an operator is also a condition. For example: (a or b) and c but also a and b or just a.
Lets take a simple example
if(decision) {
//branch 1
} else {
//branch 2
}
You need two tests to cover both branches. Thats the decision coverage or branch coverage. In case the decision is a condition (i.e. just a), that is also called basic condition coverage, which is the coverage of the two branches of a single condition.
The decision can be broken down into conditions.
Lets take for example
decision = (a or b) and c
The decision coverage would be achieved with
a,b,c = 0
a,b,c = 1
But the permutation of all the combinations of its boolean sub expressions is the full condition coverage or multiple condition coverage), which is the compound of the basic condition coverage :
| a | b | c |
| 0 | 0 | 1 |
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 0 | 1 | 0 |
| 1 | 0 | 1 |
| 1 | 0 | 0 |
| 1 | 1 | 1 |
| 1 | 1 | 0 |
That would be quite a lot of tests, but some of those are redundant as some conditions are covered by others. This is reflected in the Modified Condition/Decision Coverage (MC/DC) which is a combination of condition coverage and function coverage.
For MC/DC it is required, that each condition has to affect the outcome independently. With the above test (all are 0 or all are 1), we ignore the fact, that c-value doesn't matter if a and b are 0, or, that b-value doesnt matter if a and c are 1.
So you should sit down, use your brain, and think about for which combinations the overall result R is 1 or 0.
| a | b | c | a or b | c | R | eq
1 | 0 | 0 | 0 | 0 | 0 | 0 | A
2 | 0 | 0 | 1 | 0 | 1 | 0 | B
3 | 0 | 1 | 0 | 1 | 0 | 0 | A
4 | 0 | 1 | 1 | 1 | 1 | 1 | C
5 | 1 | 0 | 0 | 1 | 0 | 0 | A
6 | 1 | 0 | 1 | 1 | 1 | 1 | D
7 | 1 | 1 | 0 | 1 | 0 | 0 | A
8 | 1 | 1 | 1 | 1 | 1 | 1 | D
The last column shows the equivalence class:
A: c = 0, result is 0, neither a nor b have an influence
B: a,b = 0, result is 0, c has no influence
C: b,c = 1, result is 1, a has no influence
D: a,c = 1, result is 1, b has no influence
For B and C it's quite obvious which to pick, not so for A and D. For each you have to check, what would happen, if I replace the operators, i.e. or -> and, and -> or, how will this affect the result of the (sub)decision. If the result will be affected, you got a candidate - If not, you don't.
A : (0 and/or 0) and/or 0 -> doesn't matter
A : (0 and 1) vs (0 or 1) -> does matter! -> Candidate
A : (1 and 0) vs (1 or 0) -> does matter! -> Candidate
A : (1 and/or 1) -> doesn't matter
D : (1 and 0) vs (1 or 0) -> does matter -> Candidate
D : (1 and 1) -> doesn't matter
So you get the final test set as mentioned above:
a = 0, b = 1, c = 0 -> false branch (A) OR a = 1, b = 0, c = 0
a = 0, b = 0, c = 1 -> false branch (B)
a = 0, b = 1, c = 1 -> true branch (C)
a = 1, b = 0, c = 1 -> true branch (D)
Especially the latter test - changing the operators - can be done with tools like mutation testing, that do not just replaing operators, but can do quite some more, i.e. flipping operands, removing statements, change order of execution, replace return values etc. And for each alteration of your code, it verifies if the test actually fails. This is good indicator of the quality of your test suite and ensures that code is not just covered but your tests for the code are actually valid.
Regarding the terminology, I couldn't find the term "Compound Decision Coverage" somewhere. In my view a "compound decision" would be a compound of compounds of conditions, in other words: a compound of conditions.

Related

Confusing matching behaviour of pandas extract(all)

I have a strange problem. But first, I want to match a hierarchy-based string onto the value of a column in a pandas data frame and count the occurrence of the current node and all of its children.
| index | hierarchystr |
| ----- | --------------------- |
| 0 | level0level00level000|
| 1 | level0level01 |
| 2 | level0level02level021|
| 3 | level0level02level021|
| 4 | level0level02level020|
| 5 | level0level02level021|
| 6 | level1level02level021|
| 7 | level1level02level021|
| 8 | level1level02level021|
| 9 | level2level02level021|
Assume that there are 300k lines. Each node can have multiple children with again multiple children so on and so forth (here represented by level0-2 strings). Now I have a separate hierarchy where I extract the hierarchy strings from. Now to the problem:
#hstrs = ["level0", "level1", "level0level01", "level0level02", "level0level02level021"]
pat = "|".join(hstrs)
s = df.hierarchystr.str.extract('(' + pat + ')', expand=True)[0]
df1 = df.groupby(s).size().reset_index(name='Count')
df1 = df1[df1 > 200]
size = len(df1)
The size of the found matched substrings with occurrence greater than 200 differ every RUN! "level0" should match every row where the hierarchy str level0 is included and should build a group with all its subchildren and that size needs to be greater than 200.
Edit:// levelX is just an example, i have thousands of nodes, with different names and again thousands of different subchilds. The hstrs strings do not include each other, besides the parent nodes. (E.g. "parent1" is included in "parent1subchild1" and "parent1subchild2")
I traced it back to a different order of the hierarchy strings in the array hstrs. So I changed the code and compare each substring individually:
for hstr in hstrs:
s = df.hierarchystr.str.extract('(' + hstr + ')', expand=True)
s2 = s.count()
s3 = s2.values[0]
if s3 > 200:
list.append(hstr)
This is slow as hell, but the result sticks the same, no matter which order hstrs has. But for efficiency is it possible to do the same with only one regex matching group, all at once for all hstrs?
Edit://
expected output would be:
|index| 0 | Count |
|-----|---------------------|-------|
|0 |level0 | 5 |
|1 |level1 | 3 |
|2 |level0level01 | 1 |
|3 |level0level02 | 4 |
|4 |level0level02level021| 3 |
Edit2://
it has something to do with the ordering of hstrs. I think with the match and stop after the first match the behavior of the extract method. If the ordering is different the hierarchy strings in the pat will be matched differently which results in different sizes of each group. A high hierarchy (short str) will be matched first, the lower hierarchy levels in the same pat won't be matched again. But IDK what to do against this behavior.
Edit3://
an alternative would be, but is also slow as hell:
for hstr in hstrs:
s = df[df.hierarchystr.str.contains(fqn)]
s2 = s.count()
s3 = s2.values[0]
if s3 > 200:
beforeset.append(fqn)
Edit4://
I think what I am searching for is the opportunity to do a "group_by" with "contains" or "is in" for the hstrs. I am glad for every Idea. :)
Edit5://
Found a simple, but not satisfying alternative (but faster than the previous tries):
containing =[item for hierarchystr in df.hierarchystr for item in hstrs if item in hierarchystr]
containing = Counter(containing)
df1 = pd.DataFrame([containing]).T
nodeNamesWithOver200 = df1[df1 > 200].dropna().index.values

What is the meaning of the "Load" column in Apache balancer-manager?

I've set up the Apache (2.4) load-balancer which is working okay. To monitor its performance, I enabled the balancer-manager handler, which shows the status of the balancers.
I noticed a "Load" column, which was not present in version 2.2, with a value that may be negative, but I don't understand its meaning nor I was able to find documentation relative to this.
Can anyone explain the meaning of that value or point me to the right documentation?
I now understood, how the calculation of "Load" works. Here is a I think more simpler example than on the apache documents page.
Let's say we have 3 worker and a configured load factor of 1.
1) Start
a | b | c
--+---+---
0 | 0 | 0
add the load factor of 1 to all workers
a | b | c
--+---+---
1 | 1 | 1
now select the one with highest value --> a and decrease by the sum of the factor of all (=3) - this is the selected worker
a | b | c
---+---+---
-2 | 1 | 1
2) next round, add again 1 to all
a | b | c
---+---+---
-1 | 2 | 2
now select the one with highest value --> b and decrease by the sum of the factor of all (=3) - this is the selected worker
a | b | c
---+----+----
-1 | -1 | 2
3) next round, add again 1
a | b | c
---+----+----
0 | 0 | 3
now select the one with highest value --> c and decrease by the sum of the factor of all (=3) - this is the selected worker
a | b | c
---+----+----
0 | 0 | 0
startover again :)
I hope this helps others.
The Load value is populated by lbstatus based on this line of code:
ap_rprintf(r, "<td>%d</td><td>", worker->s->lbstatus);
in https://svn.apache.org/viewvc/httpd/httpd/trunk/modules/proxy/mod_proxy_balancer.c?view=markup#l1767 (line might changed when the code modified)
Since your method is by request, lbstatus is specified by mod_lbmethod_byrequests which define:
lbstatus is how urgent this worker has to work to fulfill its quota of
work.
Details on the algorithm can be found here: https://httpd.apache.org/docs/2.4/mod/mod_lbmethod_byrequests.html
i too want to know to description for others column like BUSY, ELECTED etc.. my LB has BUSY over 100 already.. i though BUSY should not exceed 100 ( as in 100% server busyness or something )

Generate a variable for an entire household based on another variable for cross sectional data

I have a dataset on various households. Each household has various individuals. I want to take one entry(individual) for a specific variable and apply it to the entire household. Finally I want to perform regression on the household data and ignore the individuals. Eg:
Household id | Individual id | Variable 1
H1 | A1 |a
H1 | A2 | .
H1 | A3 | a
H1 | A4 | a
H2 | A1 | B
H2 | A2 | .
H3 | A1 | B
H3 | A2 | a
H3 | A3 | .
H3 | A4 | a
H3 | A5 | .
H3 | A6 | .
I want to create a data where :
Household id Variable 1
H1 | a
H2 | B
H3 | B
[EDIT : it is possible to do it through egen. I was trying to approach it through nested loops.
> foreach h in household_id{
> local d = 0
> local e = 0
> foreach i in individual_id{
> replace `d' = 1 if var1 == 1
> replace `e' = 2 if var1 == 2
> }
> foreach i in individual_id{
> replace a = 1 if `d' == 1
> replace a = 2 if `e' == 2
> }
> }
Here a is another variable which I will directly use to replace var1. The code gives an error :
0 invalid name
What are the corrections required in my code? Is there a fundamental flaw in my usage of local macros? Using tempvar didn't help either.
The code shown is very confused.
A loop over a single entity that is never referred to inside the loop is legal but just is equivalent to the statements inside, executed once.
Notice that you never refer to the local macros h or i within your loops.
Thus your code is equivalent to
local d = 0
local e = 0
replace `d' = 1 if var1 == 1
replace `e' = 2 if var1 == 2
replace a = 1 if `d' == 1
replace a = 2 if `e' == 2
That fails when the local macro d is first replaced by its contents, which results in
replace 0 = 1 if var1 == 1
and the error message is correct because 0 is not a valid variable name. The next statement would raise the same problem.
Stata never gets as far as testing
... if 0 == 1
... if 0 == 2
and those statements, although legal, would do nothing because the conditions will never be satisfied.
So far, so negative. I can't suggest code positively because I don't understand what you want to do. What you have posted doesn't look like real or realistic data. I suggest that you look at http://www.statalist.org/forums/help#stata which suggests how to post Stata data examples (and so is relevant here) and also at https://stackoverflow.com/help/mcve (which does lay down the standard here).

Multiple Font Style Combinations in vb.net

If I want to create a font with multiple style combinations, like bold AND underline, I have to place the 'or' statement between it, like in the example below:
lblArt.Font = New Font("Tahoma", 18, FontStyle.Bold Or FontStyle.Underline)
If you place bold 'and' underline, it won't work, and you only get 1 of the 2 (like how the or statement should be working), while that would be the logically way to do it. What is the reason behind this?
Boolean logic works a bit differently than the way we use the terms in English. What's happening here is that the enumerated FontStyle values are actually bit flags, and in order to manipulate bit flags, you use bitwise operations.
To combine two bit flags, you OR them together. An OR operation combines the two values. So imagine that FontStyle.Bold was 2 and FontStyle.Underline was 4. When you OR them together, you get 6—you've combined them together. In Boolean logic, you can think of an OR operation as returning "true" (i.e., setting that bit in the result) if either of the bits in the two operands are set, and "false" if neither of the bits in the two operands are set.
You can write a truth table for such an operation as follows:
| A | B | A OR B |
|---|---|--------|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 1 |
Notice that the results more closely mirror what we, in informal English, would call "and". If either one has it set, then the result has it set, too.
In contrast to OR, a bitwise AND operation only returns "true" (i.e., sets that bit in the result) if both of the bits in the two operands are set. Otherwise, the result is "false". Again, a truth table can be written:
| A | B | A AND B |
|---|---|---------|
| 0 | 0 | 0 |
| 0 | 1 | 0 |
| 1 | 0 | 0 |
| 1 | 1 | 1 |
Assuming again that FontStyle.Bold has the value 2 and FontStyle.Underline has the value 4, if you AND them together, you get 0. This is because the values effectively cancel each other out. The net result is that you don't get any font styles—precisely why it doesn't work when you write FontStyle.Bold And FontStyle.Underline.
In VB, a bitwise OR operation is performed using the Or operator. The And operator performs a bitwise AND operation. So in order to do a bitwise inclusion of values, which is how you combine bit flags, you use the Or operator.
try this:
lblArt.Font = New Drawing.Font("Tahoma", _
18, _
FontStyle.Bold or FontStyle.Italic)
use "New Drawing.Font" instead of Font alone
Source

Bitwise operator predence

A gotcha I've run into a few times in C-like languages is this:
original | included & ~excluded // BAD
Due to precedence, this parses as:
original | (included & ~excluded) // '~excluded' has no effect
Does anyone know what was behind the original design decision of three separate precedence levels for bitwise operators? More importantly, do you agree with the decision, and why?
The operators have had this precedence since at least C.
I agree with the order because it is the same relative order as the relative order of the arithmetic operators that they are most similar to (+, * and negation).
You can see the similarity of & vs *, and | vs + here:
A B | A&B A*B | A|B A+B
0 0 | 0 0 | 0 0
0 1 | 0 0 | 1 1
1 0 | 0 0 | 1 1
1 1 | 1 1 | 1 2
The similarity of bitwise not and negation can be seen by this formula:
~A = -A - 1
To extend Mark Byers' answer, in boolean algebra (used extensively by electrical engineers to simplify logic circuits to the minimum number of gates and to avoid race conditions), the tradition is that bitwise AND takes precedent over bitwise OR. C was just following this established tradition. See http://en.wikiversity.org/wiki/Boolean_algebra#Combining_Operations :
Just as in ordinary algebra, where
multiplication takes priority over
addition, AND takes priority (or
precedence) over OR.