branch prediction, and optimized code

branch prediction, and optimized code - optimization

I have following set of code blocks, the purpose of the both block is same.
I had to implement the 2nd block to avoid inverse logic and to increase the readability.
BTW, in the production code the condition is very complex.
The question is - I know branching is bad, how much penalty I have to pay.
Just as an extra info, please also consider, the probability of else branch is very high.
X = Get_XValue()
if (X != 5)
{
K = X+3;
.
.
}
X = Get_XValue()
if (X == 5)
{
/*do nothing*/
}
else
{
K = X+3;
.
.
}

This all comes down to your compiler. A good optimizing compiler will detect that the then-clause in the second example is empty and reverse the test. Thus it will generate the same code for both cases so there should be no penalty at all.
As a side note, I can add that this was the case for all three compilers I tried (clang, gcc, and iccarm),

Related

What is the time complexity of below function?

I was reading book about competitive programming and was encountered to problem where we have to count all possible paths in the n*n matrix.
Now the conditions are :
`
1. All cells must be visited for once (cells must not be unvisited or visited more than once)
2. Path should start from (1,1) and end at (n,n)
3. Possible moves are right, left, up, down from current cell
4. You cannot go out of the grid
Now this my code for the problem :
typedef long long ll;
ll path_count(ll n,vector<vector<bool>>& done,ll r,ll c){
ll count=0;
done[r][c] = true;
if(r==(n-1) && c==(n-1)){
for(ll i=0;i<n;i++){
for(ll j=0;j<n;j++) if(!done[i][j]) {
done[r][c]=false;
return 0;
}
}
count++;
}
else {
if((r+1)<n && !done[r+1][c]) count+=path_count(n,done,r+1,c);
if((r-1)>=0 && !done[r-1][c]) count+=path_count(n,done,r-1,c);
if((c+1)<n && !done[r][c+1]) count+=path_count(n,done,r,c+1);
if((c-1)>=0 && !done[r][c-1]) count+=path_count(n,done,r,c-1);
}
done[r][c] = false;
return count;
}
Here if we define recurrence relation then it can be like: T(n) = 4T(n-1)+n2
Is this recurrence relation true? I don't think so because if we use masters theorem then it would give us result as O(4n*n2) and I don't think it can be of this order.
The reason, why I am telling, is this because when I use it for 7*7 matrix it takes around 110.09 seconds and I don't think for n=7 O(4n*n2) should take that much time.
If we calculate it for n=7 the approx instructions can be 47*77 = 802816 ~ 106. For such amount of instruction it should not take that much time. So here I conclude that my recurrene relation is false.
This code generates output as 111712 for 7 and it is same as the book's output. So code is right.
So what is the correct time complexity??

No, the complexity is not O(4^n * n^2).
Consider the 4^n in your notation. This means, going to a depth of at most n - or 7 in your case, and having 4 choices at each level. But this is not the case. In the 8th, level you still have multiple choices where to go next. In fact, you are branching until you find the path, which is of depth n^2.
So, a non tight bound will give us O(4^(n^2) * n^2). This bound however is far from being tight, as it assumes you have 4 valid choices from each of your recursive calls. This is not the case.
I am not sure how much tighter it can be, but a first attempt will drop it to O(3^(n^2) * n^2), since you cannot go from the node you came from. This bound is still far from optimal.

How to avoid usize going negative?

I'm translating a chunk (2000 lines) of proprietary C code into Rust. In C, it is common to run a pointer, array index, etc. down, for as long as it is non-negative. In Rust, simplified to the bone, it would look something like:
while i >= 0 && more_conditions {
more_work;
i -= 1;
}
Of course, when i is usize, you get an under-overflow from subtraction. I have learned to work around this by using for loops with .rev(), offsetting my indexes by one, or using a different type and casting with as usize, etc.
Usually it works, and usually I can make it legible, but the code I'm modifying is chock-full of indexes running towards each other, and eventually tested with i_low > i_high
Something like (in Rust)
loop {
while condition1(i_low) { i_low += 1; }
while condition2(i_high) { j_high -= 1; }
if i_low > i_high { return something; }
do_something_else;
}
Every now and then this panics, as i_high runs past 0.
I have been inserting a lot of j_high >= 0 && in the code, and it become a lot less readable.
How do experienced Rust programmers avoid usize variables going to -1?
for loops? for i in (0..size).rev()
casting? i as usize, after checking for i < 0
offsetting your variable by one, and using i-1 when safe?
extra conditionals?
catching exceptions?
Or do you just eventually learn to write programs around these situations?
Clarification: The C code is not broken - it has been supposedly in production for ten years, structuring video segments on multiple servers 24/7. It is just not following Rust conventions - it often returns -1 as an index, it recurses with -1 for the low index of an array to process, and indexes go negative all the time. All of these are handled before problems occurs - ugly, but functional. Something like:
incident_segment = detect_incident(array, start, end);
attach(array, incident_segment);
store(array, start, incident_segment - 1);
process(array, incident_segment + 1, end);
In the above code, every single of the three resulting calls may be getting a segment index that's -1 (attach, store) or out of bounds (process) It's handled, but after the call.
My Rust code appears to be working as well. As a matter of fact, in order to deal with the negative usize, I added additional logic that pruned a number of recursions, so it runs about as fast as the C code (apparently faster, but that's also because I distributed the output on multiple drives)
The issue is that the client does not not want a full rewrite, and wants the 'native' programmers to be able to check the two programs against each other. Based on the answers so far, I'm thinking that using i64 and casting/shadowing as needed may be the best way to produce code that's easy to read for the 'natives'. Which I personally do not have to like...

If you want to do it idiomatically:
for j in (0..=i).rev() {
if conditions {
break;
}
//use j as your new i here
}
Note the use of ..=i here in the iterator, this means that it'll actually iterate including i: [0, 1, 2, ..., i-1, i], otherwise, you end up with [0, 1, 2, ..., i-2, i-1]
Otherwise, here is the code:
while (i as isize - 1) != -2 && more_conditions {
more_work;
i -= 1;
}
playground

I'd probably start by using saturating_sub (and _add for parallel structure):
while condition1(i_low) { i_low = i_low.saturating_add(1); }
while condition2(i_high) { j_high = j_high.saturating_sub(1); }
You need to be careful to ensure that your logic handles the value saturating at zero. You could also use more C-like semantics with wrapping_sub.
Truthfully, there's no one-size-fits-all solution. Many times, complicated logic becomes simpler if you abstract it a bit, or turn it slightly sideways. You haven't provided any concrete examples, so we cannot give any useful advice. I solve way too many problems with iterators, so that's often my first solution.
catching exceptions
Absolutely not. That's exceedingly inefficient and non-idiomatic.

SMO algorithm running into infinite loop?

I'm interest in building an SVM multi class classifier, so I am currently implementing
Sequential minimal optimization SMO.
My implementation is based on the pseudo code in
`Fast Training of Support Vector Machines using Sequential Minimal Optimization" by John C. Platt
I observed that for certain training examples. The Smo may diverge and run into an infinite loop
The following loop in the main routine
numChanged = 0;
examineAll = 1;
while (numChanged > 0 || examineAll >0) {…}
may run into an infinite loop.
Is there there clue or criterion to prevent the smo algorithm routine from running into an infinite loop?
I would like to thank you for your answer.
Regards

You can add a max iteration condition if you want:
while ((numChanged > 0 || examineAll) && iter < MaxIter)
but for most cases it shouldn't run into an infinite loop, this is the full Platt's pseudocode:
while (numChanged > 0 || examineAll)
{
numChanged = 0;
// Adding curly brackets for better readability
if (examineAll)
{
loop I over all training examples
numChanged += examineExample(I);
}
else
{
loop I over examples where alpha is not 0 & not C
numChanged += examineExample(I);
}
if (examineAll == 1)
{
examineAll = 0;
}
else
{
examineAll = 1;
}
}
Notice that what it is doing is performing an iteration to examine the example and the next one do the same just to those examples where alpha is not 0 or C. If nothing changes after the "examine all" iteration, the while loop condition will be false hence stopping the loop.
So, for that to be in a infinite loop there must be a corner case (probably a numerical error) that introduce oscillations making examples to change during the examine all phase but not changing in the "examine only alpha == 0 and C".
Usually if the data is normalized in [-1,1] or [0,1] and the parameters of the algorithm have reasonable values, those corner cases would be rare. In any case, if you want to be extra careful you can put the max-iter safety net.

Make interpreter execute faster

I've created an interprter for a simple language. It is AST based (to be more exact, an irregular heterogeneous AST) with visitors executing and evaluating nodes. However I've noticed that it is extremely slow compared to "real" interpreters. For testing I've ran this code:
i = 3
j = 3
has = false
while i < 10000
j = 3
has = false
while j <= i / 2
if i % j == 0 then
has = true
end
j = j+2
end
if has == false then
puts i
end
i = i+2
end
In both ruby and my interpreter (just finding primes primitively). Ruby finished under 0.63 second, and my interpreter was over 15 seconds.
I develop the interpreter in C++ and in Visual Studio, so I've used the profiler to see what takes the most time: the evaluation methods.
50% of the execution time was to call the abstract evaluation method, which then casts the passed expression and calls the proper eval method. Something like this:
Value * eval (Exp * exp)
{
switch (exp->type)
{
case EXP_ADDITION:
eval ((AdditionExp*) exp);
break;
...
}
}
I could put the eval methods into the Exp nodes themselves, but I want to keep the nodes clean (Terence Parr saied something about reusability in his book).
Also at evaluation I always reconstruct the Value object, which stores the result of the evaluated expression. Actually Value is abstract, and it has derived value classes for different types (That's why I work with pointers, to avoid object slicing at returning). I think this could be another reason of slowness.
How could I make my interpreter as optimized as possible? Should I create bytecodes out of the AST and then interpret bytecodes instead? (As far as I know, they could be much faster)
Here is the source if it helps understanding my problem: src
Note: I haven't done any error handling yet, so an illegal statement or an error will simply freeze the program. (Also sorry for the stupid "error messages" :))
The syntax is pretty simple, the currently executed file is in OTZ1core/testfiles/test.txt (which is the prime finder).
I appreciate any help I can get, I'm really beginner at compilers and interpreters.

One possibility for a speed-up would be to use a function table instead of the switch with dynamic retyping. Your call to the typed-eval is going through at least one, and possibly several, levels of indirection. If you distinguish the typed functions instead by name and give them identical signatures, then pointers to the various functions can be packed into an array and indexed by the type member.
value (*evaltab[])(Exp *) = { // the order of functions must match
Exp_Add, // the order type values
//...
};
Then the whole switch becomes:
evaltab[exp->type](exp);
1 indirection, 1 function call. Fast.

Practice of checking 'trueness' or 'equality' in conditional statements - does it really make sense?

I remember many years back, when I was in school, one of my computer science teachers taught us that it was better to check for 'trueness' or 'equality' of a condition and not the negative stuff like 'inequality'.
Let me elaborate - If a piece of conditional code can be written by checking whether an expression is true or false, we should check the 'trueness'.
Example: Finding out whether a number is odd - it can be done in two ways:
if ( num % 2 != 0 )
{
// Number is odd
}
or
if ( num % 2 == 1 )
{
// Number is odd
}
(Please refer to the marked answer for a better example.)
When I was beginning to code, I knew that num % 2 == 0 implies the number is even, so I just put a ! there to check if it is odd. But he was like 'Don't check NOT conditions. Have the practice of checking the 'trueness' or 'equality' of conditions whenever possible.' And he recommended that I use the second piece of code.
I am not for or against either but I just wanted to know - what difference does it make? Please don't reply 'Technically the output will be the same' - we ALL know that. Is it a general programming practice or is it his own programming practice that he is preaching to others?
NOTE: I used C#/C++ style syntax for no reason. My question is equally applicable when using the IsNot, <> operators in VB etc. So readability of the '!' operator is just one of the issues. Not THE issue.

The problem occurs when, later in the project, more conditions are added - one of the projects I'm currently working on has steadily collected conditions over time (and then some of those conditions were moved into struts tags, then some to JSTL...) - one negative isn't hard to read, but 5+ is a nightmare, especially when someone decides to reorganize and negate the whole thing. Maybe on a new project, you'll write:
if (authorityLvl!=Admin){
doA();
}else{
doB();
}
Check back in a month, and it's become this:
if (!(authorityLvl!=Admin && authorityLvl!=Manager)){
doB();
}else{
doA();
}
Still pretty simple, but it takes another second.
Now give it another 5 to 10 years to rot.
(x%2!=0) certainly isn't a problem, but perhaps the best way to avoid the above scenario is to teach students not to use negative conditions as a general rule, in the hopes that they'll use some judgement before they do - because just saying that it could become a maintenance problem probably won't be enough motivation.
As an addendum, a better way to write the code would be:
userHasAuthority = (authorityLvl==Admin);
if (userHasAuthority){
doB();
else{
doA();
}
Now future coders are more likely to just add "|| authorityLvl==Manager", userHasAuthority is easier to move into a method, and even if the conditional is reorganized, it will only have one negative. Moreover, no one will add a security hole to the application by making a mistake while applying De Morgan's Law.

I will disagree with your old professor - checking for a NOT condition is fine as long as you are checking for a specific NOT condition. It actually meets his criteria: you would be checking that it is TRUE that a value is NOT something.
I grok what he means though - mostly the true condition(s) will be orders of magnitude smaller in quantity than the NOT conditions, therefore easier to test for as you are checking a smaller set of values.

I've had people tell me that it's to do with how "visible" the ping (!) character is when skim reading.
If someone habitually "skim reads" code - perhaps because they feel their regular reading speed is too slow - then the ! can be easily missed, giving them a critical mis-understanding of the code.
On the other hand, if a someone actually reads all of the code all of the time, then there is no issue.
Two very good developers I've worked with (and respect highily) will each write == false instead of using ! for similar reasons.
The key factor in my mind is less to do with what works for you (or me!), and more with what works for the guy maintaining the code. If the code is never going to be seen or maintained by anyone else, follow your personal whim; if the code needs to be maintained by others, better to steer more towards the middle of the road. A minor (trivial!) compromise on your part now, might save someone else a week of debugging later on.
Update: On further consideration, I would suggest factoring out the condition as a separate predicate function would give still greater maintainability:
if (isOdd(num))
{
// Number is odd
}

You still have to be careful about things like this:
if ( num % 2 == 1 )
{
// Number is odd
}
If num is negative and odd then depending on the language or implementation num % 2 could equal -1. On that note, there is nothing wrong with checking for the falseness if it simplifies at least the syntax of the check. Also, using != is more clear to me than just !-ing the whole thing as the ! may blend in with the parenthesis.
To only check the trueness you would have to do:
if ( num % 2 == 1 || num % 2 == -1 )
{
// Number is odd
}
That is just an example obviously. The point is that if using a negation allows for fewer checks or makes the syntax of the checks clear then that is clearly the way to go (as with the above example). Locking yourself into checking for trueness does not suddenly make your conditional more readable.

I remember hearing the same thing in my classes as well. I think it's more important to always use the more intuitive comparison, rather than always checking for the positive condition.

Really a very in-consequential issue. However, one negative to checking in this sense is that it only works for binary comparisons. If you were for example checking some property of a ternary numerical system you would be limited.

Replying to Bevan (it didn't fit in a comment):
You're right. !foo isn't always the same as foo == false. Let's see this example, in JavaScript:
var foo = true,
bar = false,
baz = null;
foo == false; // false
!foo; // false
bar == false; // true
!bar; // true
baz == false; // false (!)
!baz; // true

I also disagree with your teacher in this specific case. Maybe he was so attached to the generally good lesson to avoid negatives where a positive will do just fine, that he didn't see this tree for the forest.
Here's the problem. Today, you listen to him, and turn your code into:
// Print black stripe on odd numbers
int zebra(int num) {
if (num % 2 == 1) {
// Number is odd
printf("*****\n");
}
}
Next month, you look at it again and decide you don't like magic constants (maybe he teaches you this dislike too). So you change your code:
#define ZEBRA_PITCH 2
[snip pages and pages, these might even be in separate files - .h and .c]
// Print black stripe on non-multiples of ZEBRA_PITCH
int zebra(int num) {
if (num % ZEBRA_PITCH == 1) {
// Number is not a multiple of ZEBRA_PITCH
printf("*****\n");
}
}
and the world seems fine. Your output hasn't changed, and your regression testsuite passes.
But you're not done. You want to support mutant zebras, whose black stripes are thicker than their white stripes. You remember from months back that you originally coded it such that your code prints a black stripe wherever a white strip shouldn't be - on the not-even numbers. So all you have to do is to divide by, say, 3, instead of by 2, and you should be done. Right? Well:
#define DEFAULT_ZEBRA_PITCH 2
[snip pages and pages, these might even be in separate files - .h and .c]
// Print black stripe on non-multiples of pitch
int zebra(int num, int pitch) {
if (num % pitch == 1) {
// Number is odd
printf("*****\n");
}
}
Hey, what's this? You now have mostly-white zebras where you expected them to be mostly black!
The problem here is how think about numbers. Is a number "odd" because it isn't even, or because when dividing by 2, the remainder is 1? Sometimes your problem domain will suggest a preference for one, and in those cases I'd suggest you write your code to express that idiom, rather than fixating on simplistic rules such as "don't test for negations".

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

branch prediction, and optimized code - optimization

Related

What is the time complexity of below function?

How to avoid usize going negative?

SMO algorithm running into infinite loop?

Make interpreter execute faster

Practice of checking 'trueness' or 'equality' in conditional statements - does it really make sense?

Categories

Resources