Does "if ([bool] == true)" require one more step than "if ([bool])"? - optimization

This is a purely pedantic question, to sate my own curiosity.
I tend to go with the latter option in the question (so: if (boolCheck) { ... }), while a coworker always writes the former (if (boolCheck == true) { ... }). I always kind of teased him about it, and he always explained it as an old habit from when he was first starting programming.
But it just occurred to me today that actually writing out the whole == true part may in fact require an additional step for processing, since any expression with a == operator gets evaluated to a Boolean value. Is this true?
In other words, as I understand it, the option without the == true line could be loosely described as follows:
Check X
While the option with the == true line would be more like:
Let Y be true if X is true, otherwise false
Check Y
Am I correct? Or perhaps any normal compiler/interpreter will do away with this difference? Or am I overlooking something, and there's really no difference at all?
Obviously, there will be no difference in terms of actual observed performance. Like I said, I'm just curious.
EDIT: Thanks to everyone who actually posted compiled results to illustrate whether the steps were different between the two approaches. (It seems, most of the time, they were, albeit only slightly.)
I just want to reiterate that I was not asking about what is the "right" approach. I understand that many people favor one over the other. I also understand that, logically, the two are identical. I was just curious if the actual operations being performed by the CPU are exactly the same for both methods; as it turns out, much of the time (obviously it depends on language, compiler, etc.), they are not.

I think the comparison with true shows a lack of understanding by your partner. Bools should be named things to avoid this (such as isAvailable: if (isAvailable) {...}).

I would expect the difference to be optimised away by any half-decent compiler.
(I just checked with C# and the compiled code is exactly the same for both syntaxes.)
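For reference, here is a minimal C# sketch of the two forms (my code, not the answerer's) that you can compile and inspect yourself with a disassembler such as ildasm or ILSpy:
using System;
class Demo
{
    static void Main()
    {
        bool boolCheck = DateTime.Now.Second % 2 == 0;
        // The answer above reports that both of these compile to the same IL
        // (load the bool, branch if false); the disassembler will show it.
        if (boolCheck) { Console.WriteLine("without == true"); }
        if (boolCheck == true) { Console.WriteLine("with == true"); }
    }
}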

The compiler should generate the same code. However, comparing with true is arguably better because it is more explicit. Generally I don't do the explicit comparison, but you shouldn't make fun of him for doing it.
Edit: The easiest way to tell is to try. The MS compiler (cl.exe) generates the same number of steps in assembly:
int _tmain(int argc, _TCHAR* argv[])
{
    bool test_me = true;
    if (test_me) {
004113C2  movzx  eax,byte ptr [test_me]
004113C6  test   eax,eax
004113C8  je     wmain+41h (4113E1h)
        printf("test_me was true!");
    }
    if (test_me == true) {
004113E1  movzx  eax,byte ptr [test_me]
004113E5  cmp    eax,1
004113E8  jne    wmain+61h (411401h)
        printf("still true!");
    }
    return 0;
}
At this point the question is do test and cmp have the same cost? My guess is yes, though experts may be able to point out differences.
The practical upshot is you shouldn't worry about this. Chances are you have way bigger performance fish to fry.

Duplicate question (Should I use `!IsGood` or `IsGood == false`?). Here's a pointer to my previous answer:
The technique of testing specifically against true or false is bad practice if the variable in question is really supposed to be used as a boolean value (even if its type is not boolean) - especially in C/C++. Testing against true can (and probably will) lead to subtle bugs.
See the following SO answer for details:
Should I use `!IsGood` or `IsGood == false`?
Here is a thread which explains in more detail why comparing with "== true" is problematic, including Stroustrup's explanation:
https://qt-project.org/forums/viewthread/32642

In my experience if (flag==true) is bad practice.
The first argument is academic:
If you have a bool flag, it is either true or false.
Now, the expression
(flag==true)
again, is true or false - it is no more expressive, only redundant - flag can't get "more true" or "more false" than it already is. It would be "clearer" only if it's not obvious flag is a boolean - but there's a standard way to fix that which works for all types: choose a better name.
Stretching this beyond reason, the following would be "even better":
((flag==true)==true)
The second argument is pragmatic and platform-specific
C and early C++ implementations had no real "bool" type, so there are different conventions for flags, the most common being that anything nonzero is true. It is not uncommon for APIs to return an integer-based BOOL type without enforcing that the return value is 0 or 1.
Some environments use the following definitions:
#define FALSE 0
#define TRUE (!FALSE)
good luck with if ((1==1) == TRUE)
Further, some platforms use different values - e.g. the VARIANT_BOOL for VB interop is a short, and VARIANT_TRUE is -1.
When mixing libraries using these definitions, an explicit comparison to true can easily be an error disguised as good intentions. So, don't.
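As a concrete illustration (a hedged sketch of mine, not from the answer above; connection_ready is a made-up API), the "anything nonzero is true" convention breaks under an explicit == TRUE test:
#include <stdio.h>
#define FALSE 0
#define TRUE  (!FALSE)   /* i.e. 1 */
/* Hypothetical API following the "nonzero means true" convention:
   it returns the masked bits, not necessarily 0 or 1. */
static int connection_ready(int flags) { return flags & 0x04; }
int main(void)
{
    int flags = 0x04;
    if (connection_ready(flags))          /* taken: 4 is nonzero */
        puts("ready");
    if (connection_ready(flags) == TRUE)  /* NOT taken: 4 != 1 */
        puts("ready (== TRUE)");
    return 0;
}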

Here's the Python (2.6) disassembly:
>>> def check(x): return (bool(x) == True)
>>> import dis
>>> dis.dis(check)
1 0 LOAD_GLOBAL 0 (bool)
3 LOAD_FAST 0 (x)
6 CALL_FUNCTION 1
9 LOAD_GLOBAL 1 (True)
12 COMPARE_OP 2 (==)
15 RETURN_VALUE
>>> def check2(x): return bool(x)
>>> dis.dis(check2)
1 0 LOAD_GLOBAL 0 (bool)
3 LOAD_FAST 0 (x)
6 CALL_FUNCTION 1
9 RETURN_VALUE
I suppose the reason check isn't optimized is due to Python being a dynamic language. This in itself doesn't mean Python uses a poor compiler, but it could stand to do a little type inference here. Maybe it makes a difference when a .pyd file is created?
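If you're curious whether the extra COMPARE_OP is measurable at all, here is a quick (and admittedly unscientific) check with timeit, a sketch of mine that was not part of the original answer:
import timeit
# Both statements call bool(x); the first also loads True and runs COMPARE_OP.
print(timeit.timeit('bool(x) == True', setup='x = 1'))
print(timeit.timeit('bool(x)', setup='x = 1'))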

There may be some justification for the "boolCheck == true" syntax depending on the name of the variable you're testing.
For example, if the variable name was "state", then you can either have:
if (state) {
...
}
or you can have
if (state == true) {
...
}
In this case, I think the latter form is clearer and more obvious.
Another example would be a variable like "enabled", in which case the first form is clearer.
Now having a boolean variable called "state" is bad practice, but you may have no control over that, in which case the "== true" syntax may improve code readability.

This gets into specific languages and what they consider "truthy" values. Here are two common ones: JavaScript and PHP.
The triple equals operator in both these languages assures that you really are checking for a boolean type of truth. For example, in PHP checking for if ($value) can be different from if ($value==true) or if ($value===true):
$f = true;
$z = 1;
$n = null;
$a = array(1,2);
print ($f) ?'true':'false'; // true
print ($f==true) ?'true':'false'; // true
print ($f===true) ?'true':'false'; // true
print "\n";
print ($z) ?'true':'false'; // true
print ($z==true) ?'true':'false'; // true
print ($z===true) ?'true':'false'; // false
print "\n";
print ($n) ?'true':'false'; // false
print ($n==true) ?'true':'false'; // false
print ($n===true) ?'true':'false'; // false
print "\n";
print ($a) ?'true':'false'; // true
print ($a==true) ?'true':'false'; // true
print ($a===true) ?'true':'false'; // false
print "\n";
print "\n";
I expect that languages one speaks daily will inform how one views this question.
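JavaScript behaves analogously; here is a small sketch of mine (not from the original answer) showing that == coerces types while === also requires the boolean type:
const values = [true, 1, null, [1, 2]];
for (const v of values) {
  console.log(
    Boolean(v),  // truthiness:      true, true, false, true
    v == true,   // loose equality:  true, true, false, false
    v === true   // strict equality: true, false, false, false
  );
}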

The MSVC++ 6.0 compiler generates slightly different code for the two forms:
4: if ( ok ) {
00401033 mov eax,dword ptr [ebp-8]
00401036 and eax,0FFh
0040103B test eax,eax
0040103D je main+36h (00401046)
....
7: if ( ok == true ) {
00401046 mov ecx,dword ptr [ebp-8]
00401049 and ecx,0FFh
0040104F cmp ecx,1
00401052 jne main+4Bh (0040105b)
The former version should be very slightly faster, if I remember my 8086 timings correctly :-)

It depends on the compiler. There might be optimizations for either case.

Testing for equality to true can lead to a bug, at least in C: in C, 'if' can be used on integer types (not just boolean types) and accepts any non-zero value; however, the symbol 'TRUE' is one specific non-zero value (e.g. 1). This can become important with bit-flags:
if (flags & DCDBIT)
{
//do something when the DCDBIT bit is set in the flags variable
}
So given a function like this ...
int isDcdSet()
{
return (flags & DCDBIT);
}
... the expression "if (isDcdSet())" is not the same as "if (isDcdSet() == TRUE)".
Anyway; I'd think that an optimizing compiler should optimize away any difference (because there is no logical difference), assuming it's a language with a true boolean (not just integer) type.

There is no logical difference, presuming there is no internal typecasting going on, e.g., testing fstream.
ifstream fin;
...
if( ! fin)
...

It may not be the same if you are working with a nullable bool.
Example:
bool? myBool = true;
if (myBool) { }         // This will not compile
if (myBool == true) { } // This will compile
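A slightly fuller sketch (mine, not the answerer's) of how the two forms behave with bool?:
using System;
class NullableDemo
{
    static void Main()
    {
        bool? myBool = null;
        // if (myBool) { }          // CS0266: cannot implicitly convert bool? to bool
        if (myBool == true)         // compiles; false when myBool is null or false
            Console.WriteLine("true");
        if (myBool.GetValueOrDefault())  // another common spelling of the same check
            Console.WriteLine("true");
    }
}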

IMHO it's a bad idea to use the if (boolCheck == true) form, for a simple reason: if you accidentally type a single 'equals' sign instead of a double one (i.e. if (boolCheck = true)), it will assign true to boolCheck and the condition will always be true, which is obviously a bug. Of course, most modern compilers will show a warning when they see that (at least the C# compiler does), but many developers just ignore warnings...
If you want to be explicit, you should prefer this form: if (true == boolCheck). This avoids accidental assignment to the variable and causes a compile error if you forget an 'equals' sign.

Related

Variable getting overwritten in for loop

In a for loop, a different variable is assigned a value on each iteration. The variable which was already assigned a value in the first iteration ends up getting assigned the value from the next iteration. At the end, both variables have the same value.
The code is for validating data in a file. When I print the values, it prints the correct value in the first iteration, but in the next iteration the value assigned in the first iteration is changed.
When I print the value of $value3 and $value4 in the for loop, it shows null for $value4 and some value for $value3, but in the next iteration the value of $value3 is overwritten by the value of $value4.
I have tried this on Rakudo Perl 6.c.
my $fh = $!FileName.IO.open;
my $fileObject = FileValidation.new( file => $fh );
for (3,4).list {
    put "Iteration: ", $_;
    if ($_ == 4) {
        $value4 := $fileObject.FileValidationFunction(%.ValidationRules{4}<ValidationFunction>, %.ValidationRules{4}<Arguments>);
    }
    if ($_ == 3) {
        $value3 := $fileObject.FileValidationFunction(%.ValidationRules{3}<ValidationFunction>, %.ValidationRules{3}<Arguments>);
    }
    $fh.seek: SeekFromBeginning;
}
TL;DR It's not possible to confidently answer your question as it stands. This is a nanswer -- an answer in the sense I'm writing it as one but also quite possibly not an answer in the sense of helping you fix your problem.
Is it is rw? A first look.
The is rw trait on a routine or class attribute means it returns a container that contains a value rather than just returning a value.
If you then alias that container then you can get the behavior you've described.
For example:
my $foo;
sub bar is rw { $foo = rand }
my ($value3, $value4);
$value3 := bar;
.say for $value3, $value4;
$value4 := bar;
.say for $value3, $value4;
displays:
0.14168492246366005
(Any)
0.31843665763839857
0.31843665763839857
This isn't a bug in the language or compiler. It's just P6 code doing what it's supposed to do.
A longer version of the same thing
Perhaps the above is so far from your code it's disorienting. So here's the same thing wrapped in something like the code you provided.
spurt 'junk', 'junk';
class FileValidation {
    has $.file;
    has $!foo;
    method FileValidationFunction ($,$) is rw { $!foo = rand }
}
class bar {
    has $!FileName = 'junk';
    has %.ValidationRules =
        { 3 => { ValidationFunction => {;}, Arguments => () },
          4 => { ValidationFunction => {;}, Arguments => () } }
    my ($value3, $value4);
    method baz {
        my $fh = $!FileName.IO.open;
        my $fileObject = FileValidation.new( file => $fh );
        my ($value3, $value4);
        for (3,4).list {
            put "Iteration: ", $_;
            if ($_ == 4) {
                $value4 := $fileObject.FileValidationFunction(
                    %.ValidationRules{4}<ValidationFunction>, %.ValidationRules{4}<Arguments>);
            }
            if ($_ == 3) {
                $value3 := $fileObject.FileValidationFunction(
                    %.ValidationRules{3}<ValidationFunction>, %.ValidationRules{3}<Arguments>);
            }
            $fh.seek: SeekFromBeginning;
            .say for $value3, $value4
        }
    }
}
bar.new.baz
This outputs:
Iteration: 3
0.5779679442816953
(Any)
Iteration: 4
0.8650280000277686
0.8650280000277686
Is it is rw? A second look.
Brad and I came up with essentially the same answer (at the same time; I was a minute ahead of Brad but who's counting? I mean besides me? :)) but Brad nicely nails the fix:
One way to avoid aliasing a container is to just use =.
(This is no doubt also why @ElizabethMattijsen++ asked about trying = instead of :=.)
You've commented that changing from := to = made no difference.
But presumably you didn't change from := to = throughout your entire codebase but rather just (the equivalent of) the two in the code you've shared.
So perhaps the problem can still be fixed by switching from := to =, but in some of your code elsewhere. (That said, don't just globally replace := with =. Instead, make sure you understand their difference and then change them as appropriate. You've got a test suite, right? ;))
How to move forward if you're still stuck
Right now your question has received several upvotes and no downvotes and you've got two answers (that point to the same problem).
But maybe our answers aren't good enough.
If so...
The addition of the reddit comment, and trying = instead of :=, and trying the latest compiler, and commenting on those things, leaves me glad I didn't downvote your question, but I haven't upvoted it yet and there's a reason for that. It's because your question is still missing a Minimal Reproducible Example.
You responded to my suggestion about producing an MRE with:
The problem is that I am not able to replicate this in a simpler environment
I presumed that's your situation, but as you can imagine, that means we can't confidently replicate it at all. That may be the way you prefer to go for reasons but it goes against SO guidance (in the link above) and if the current answers aren't adequate then the sensible way forward is for you to do what it takes to share code that reproduces your problem.
If it's large, please don't just paste it into your question but instead link to it. Perhaps you can set it up on glot.io using the + button to use multiple files (up to 6 I think, plus there's a standard input too). If not, perhaps gist it via, say, gist.github.com, and if I can I'll set it up on glot.io for you.
What is probably happening is that you are returning a container rather than a value, then aliasing the container to a variable.
class Foo {
    has $.a is rw;
}
my $o = Foo.new( a => 1 );
my $old := $o.a;
say $old; # 1
$o.a = 2;
say $old; # 2
One way to avoid aliasing a container is to just use =.
my $old = $o.a;
say $old; # 1
$o.a = 2;
say $old; # 1
You could also decontainerize the value using either .self or .<>
my $old := $o.a.<>;
say $old; # 1
$o.a = 2;
say $old; # 1
(Note that .<> above could be .self or just <>.)

(Identifier) terms vs. constants vs. null signature routines

Identifier terms are defined in the documentation alongside constants, with pretty much the same use case, although terms compute their value in run time while constants get it in compile time. Potentially, that could make terms use global variables, but that's action at a distance and ugly, so I guess that's not their use case.
OTOH, they could be simply routines with null signature:
sub term:<þor> { "Is mighty" }
sub Þor { "Is mighty" }
say þor, Þor;
But you can already define routines with a null signature. Using a term does, however, save you from the error you get when you write:
say Þor ~ Þor;
which, with the plain sub, produces "Too many positionals passed; expected 0 arguments but got 1", unlike the term. That seems a bit far-fetched, however, and you can avoid the trouble by just adding () at the end.
Another possible use case is defying the rules of normal identifiers
sub term:<✔> { True }
say ✔; # True
Are there any other use cases I have missed?
Making zero-argument subs work as terms would break the possibility to post-declare subs, since finding a sub after having parsed usages of it would require re-parsing of earlier code (which the Perl 6 language refuses to do, "one-pass parsing" and all that) if the sub takes no arguments.
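For example (a sketch of mine, not part of the original answer), an ordinary sub can be called before it is declared, while a term must already be declared when the parser reaches it:
say answer();        # 42 -- the sub is post-declared and resolved later
sub answer { 42 }
sub term:<greeting> { 'hello from a term' }
say greeting;        # works because the parser already knows the term
# using `greeting` *before* its declaration would not parse as a term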
Terms are useful in combination with the ternary operator:
$ perl6 -e 'sub a() { "foo" }; say 42 ?? a !! 666'
===SORRY!=== Error while compiling -e
Your !! was gobbled by the expression in the middle; please parenthesize
$ perl6 -e 'sub term:<a> { "foo" }; say 42 ?? a !! 666'
foo
Constants are basically terms. So of course they are grouped together.
constant foo = 12;
say foo;
constant term:<bar> = 36;
say bar;
There is a slight difference because term:<…> works by modifying the parser. So it takes precedence.
constant fubar = 38;
constant term:<fubar> = 45;
say fubar; # 45
The above will print 45 regardless of which constant definition comes first.
Since term:<…> takes precedence the only way to get at the other value is to use ::<fubar> to directly access the symbol table.
say ::<fubar>; # 38
say ::<term:<fubar>>; # 45
There are two main use-cases for term:<…>.
One is to get a subroutine to be parsed similarly to a constant or sigilless variable.
sub fubar () { 'fubar'.comb.roll }
# say( fubar( prefix:<~>( 4 ) ) );
say fubar ~ 4; # ERROR
sub term:<fubar> () { 'fubar'.comb.roll }
# say( infix:<~>( fubar, 4 ) );
say fubar ~ 4;
The other is to have a constant or sigilless variable be something other than a normal identifier.
my \✔ = True; # ERROR: Malformed my
my \term:<✔> = True;
say ✔;
Of course both use-cases can be combined.
sub term:<✔> () { True }
Perl 5 allows subroutines to have an empty prototype (different than a signature) which will alter how it gets parsed. The main purpose of prototypes in Perl 5 is to alter how the code gets parsed.
use v5;
sub fubar () { ord [split('','fubar')]->[rand 5] }
# say( fubar() + 4 );
say fubar + 4; # infix +
use v5;
sub fubar { ord [split('','fubar')]->[rand 5] }
# say( fubar( +4 ) );
say fubar + 4; # prefix +
Perl 6 doesn't use signatures the way Perl 5 uses prototypes. The main way to alter how Perl 6 parses code is by using the namespace.
use v6;
sub fubar ( $_ ) { .comb.roll }
sub term:<fubar> () { 'fubar'.comb.roll }
say fubar( 'zoo' ); # `z` or `o` (`o` is twice as likely)
say fubar; # `f` or `u` or `b` or `a` or `r`
sub prefix:<✔> ( $_ ) { "checked $_" }
say ✔ 'under the bed'; # checked under the bed
Note that Perl 5 doesn't really have constants, they are just subroutines with an empty prototype.
use v5;
use constant foo => 12;
use v5;
sub foo () { 12 } # ditto
(This became less true after 5.16)
As far as I know all of the other uses of prototypes have been superseded by design decisions in Perl 6.
use v5;
sub foo (&$) { $_[0]->($_[1]) }
say foo { 100 + $_[0] } 5; # 105;
That block is seen as a sub lambda because of the prototype of the foo subroutine.
use v6;
# sub foo ( &f, $v ) { f $v }
sub foo { @_[0].( @_[1] ) }
say foo { 100 + @_[0] }, 5; # 105
In Perl 6 a block is seen as a lambda if a term is expected. So there is no need to alter the parser with a feature like a prototype.
You are asking for exactly one use of prototypes to be brought back even though there is already a feature that covers that use-case.
Doing so would be a special-case. Part of the design ethos of Perl 6 is to limit the number of special-cases.
Other versions of Perl had a wide variety of special-cases, and it isn't always easy to remember them all.
Don't get me wrong; the special-cases in Perl 5 are useful, but Perl 6 has for the most part made them general-cases.

String comparison in the core language

Taking this simple comparison loopValue == "Firstname", is the following statement true?
If the internal operand inspecting the first char does not match the compared string, it will early abort
So taking the rawer form loopValue and "Firstname" are both []byte. And it would walk the array kind of like so as callback loop for truth:
someInspectionFunc(loopValue, "Firstname", func(charA, charB) {
return charA == charB
})
... making it keep on going until it bumps false and checks if the number of iterations was equal to both their lengths. Also does it check length first?
if len(loopValue) != len("Firstname") {
    return false
}
I can't really find an explanation in the go source-code on GitHub as it's a bit above me.
The reason I'm asking this is that I'm doing big data processing and am benchmarking with CPU, memory and allocation pprof to squeeze some more juice out of the process. That got me thinking about how Go (but also just C in general) does this under the hood. Does this happen fully at the assembly level, or does the comparison already happen in native Go code (kind of like sketched in the snippets above)?
Please let me know if I'm being too vague or if I missed something. Thank you
Update
When I did a first-character match in big strings of JSON before really comparing, I got about a 3.7% benchmarking gain on 100k heavy entries:
<some irrelevant inspection code>.. v[0] == firstChar && v == lookFor {
// Match found when it reaches here
}
The code above (especially on long strings) is faster than just going for v == lookFor.
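A self-contained sketch of that pattern (the function name is mine, not from the question); whether it actually helps depends on the data, as the 3.7% figure suggests:
package main
import "fmt"
// matchWithPrefixCheck compares the first byte before falling back to the
// full comparison, mirroring the v[0] == firstChar && v == lookFor idea.
func matchWithPrefixCheck(v, lookFor string) bool {
    if len(v) == 0 || len(lookFor) == 0 {
        return false
    }
    return v[0] == lookFor[0] && v == lookFor
}
func main() {
    fmt.Println(matchWithPrefixCheck("Firstname", "Firstname")) // true
    fmt.Println(matchWithPrefixCheck("Lastname", "Firstname"))  // false: rejected on the first byte
}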
The function is handled in assembly. The amd64 version is:
TEXT runtime·eqstring(SB),NOSPLIT,$0-33
MOVQ s1str+0(FP), SI
MOVQ s2str+16(FP), DI
CMPQ SI, DI
JEQ eq
MOVQ s1len+8(FP), BX
LEAQ v+32(FP), AX
JMP runtime·memeqbody(SB)
eq:
MOVB $1, v+32(FP)
RET
And it's the compiler's job to ensure that the strings are of equal length before that is called. (The runtime·memeqbody function is actually where the optimized memory comparisons happen, but there's probably no need to post the full text here)
The equivalent Go code would be:
func eqstring_generic(s1, s2 string) bool {
    if len(s1) != len(s2) {
        return false
    }
    for i := 0; i < len(s1); i++ {
        if s1[i] != s2[i] {
            return false
        }
    }
    return true
}

What is the point of coercions like Int(Cool)?

The Perl 6 Web site on functions says
Coercion types can help you to have a specific type inside a routine, but accept wider input. When the routine is called, the argument is automatically converted to the narrower type.
sub double(Int(Cool) $x) {
    2 * $x
}
say double '21'; # 42
say double Any; # Type check failed in binding $x; expected 'Cool' but got 'Any'
Here the Int is the target type to which the argument will be coerced, and Cool is the type that the routine accepts as input.
But what is the point for the sub? Isn't $x just an Int? Why would you restrict the caller to implement Cool for the argument?
I'm doubly confused by the example because Int already is Cool. So I did an example where the types don't share a hierarchy:
class Foo { method foomethod { say 'foomethod' } }
class Bar {}
class Quux is Foo {
    # class Quux { # compile error
    method Bar { Bar.new }
}
sub foo(Bar(Foo) $c) {
    say $c.WHAT; # (Bar)
    # $c.foomethod # fails if uncommented: Method 'foomethod' not found for invocant of class 'Bar'
}
foo(Quux.new)
Here the invocant of foo is restricted to provide a Foo that can be converted to a Bar but foo cannot even call a method of Foo on $c because its type is Bar. So why would foo care that the to-be-coerced type is a Foo in the first place?
Could someone shed some light on this? Links to appropriate documentation and parts of the spec are appreciated as well. I couldn't find anything useful there.
Update: Having reviewed this answer today I've concluded I had completely misunderstood what @musiKk was getting at. This was revealed most clearly in @darch's question and @musiKk's response:
@darch: Or is your question why one might prefer Int(Cool) over Int(Any)? If that's the case, that would be the question to ask.
@musiKk: That is exactly my question. :)
Reviewing the many other answers I see none have addressed it the way I now think it warrants addressing.
I might be wrong of course so what I've decided to do is leave the original question as is, in particular leaving the title as is, and leave this answer as it was, and instead write a new answer addressing #darch's reformulation.
Specify parameter type, with no coercion: Int $x
We could declare:
sub double (Int $x) { ... } # Accept only Int. (No coercion.)
Then this would work:
double(42);
But unfortunately typing 42 in response to this:
double(prompt('')); # `prompt` returns the string the user types
causes the double call to fail with Type check failed in binding $x; expected Int but got Str ("42") because 42, while looking like a number, is technically a string of type Str, and we've asked for no coercion.
Specify parameter type, with blanket coercion: Int() $x
We can introduce blanket coercion of Any value in the sub's signature:
sub double (Int(Any) $x) { ... } # Take Any value. Coerce to an Int.
Or:
sub double (Int() $x) { ... } # Same -- `Int()` coerces from Any.
Now, if you type 42 when prompted by the double(prompt('')); statement, the run-time type-check failure no longer applies and instead the run-time attempts to coerce the string to an Int. If the user types a well-formed number the code just works. If they type 123abc the coercion will fail at run-time with a nice error message:
Cannot convert string to number: trailing characters after number in '123⏏abc'
One problem with blanket coercion of Any value is that code like this:
class City { ... } # City has no Int coercion
my City $city;
double($city);
fails at run-time with the message: "Method 'Int' not found for invocant of class 'City'".
Specify parameter type, with coercion from Cool values: Int(Cool) $x
We can choose a point of balance between no coercion and blanket coercion of Any value.
The best class to coerce from is often the Cool class, because Cool values are guaranteed to either coerce nicely to other basic types or generate a nice error message:
# Accept argument of type Cool or a subclass and coerce to Int:
sub double (Int(Cool) $x) { ... }
With this definition, the following:
double(42);
double(prompt(''));
works as nicely as it can, and:
double($city);
fails with "Type check failed in binding $x; expected Cool but got City (City)" which is arguably a little better diagnostically for the programmer than "Method 'Int' not found for invocant of class 'City'".
why would foo care that the to-be-coerced type is a Foo in the first place?
Hopefully it's now obvious that the only reason it's worth limiting the coerce-from-type to Foo is because that's a type expected to successfully coerce to a Bar value (or, perhaps, fail with a friendly message).
Could someone shed some light on this? Links to appropriate documentation and parts of the spec are appreciated as well. I couldn't find anything useful there.
The document you originally quoted is pretty much all there is for enduser doc. Hopefully it makes sense now and you're all set. If not please comment and we'll go from there.
What this does is accept a value that is a subtype of Cool, and tries to transform it into an Int. At that point it is an Int no matter what it was before.
So
sub double ( Int(Cool) $n ) { $n * 2 }
can really be thought of as ( I think this is how it was actually implemented in Rakudo )
# Int is a subtype of Cool otherwise it would be Any or Mu
proto sub double ( Cool $n ) {*}
# this has the interior parts that you write
multi sub double ( Int $n ) { $n * 2 }
# this is what the compiler writes for you
multi sub double ( Cool $n ) {
    # calls the other multi since it is now an Int
    samewith Int($n);
}
So this accepts any of Int, Str, Rat, FatRat, Num, Array, Hash, etc. and tries to convert it into an Int before calling &infix:<*> with it, and 2.
say double ' 5 '; # 10
say double 2.5; # 4
say double [0,0,0]; # 6
say double { a => 0, b => 0 }; # 4
You might restrict it to a Cool instead of Any as all Cool values are essentially required to provide a coercion to Int.
( :( Int(Any) $ ) can be shortened to just :( Int() $ ) )
The reason you might do this is that you need it to be an Int inside the sub because you are calling other code that does different things with different types.
sub example ( Int(Cool) $n ) returns Int {
    other-multi( $n ) * $n;
}
multi sub other-multi ( Int $ ) { 10 }
multi sub other-multi ( Any $ ) { 1 }
say example 5; # 50
say example 4.5; # 40
In this particular case you could have written it as one of these
sub example ( Cool $n ) returns Int {
    other-multi( Int($n) ) * Int($n);
}
sub example ( Cool $n ) returns Int {
    my $temp = Int($n);
    other-multi( $temp ) * $temp;
}
sub example ( Cool $n is copy ) returns Int {
    $n = Int($n);
    other-multi( $n ) * $n;
}
None of them are as clear as the one that uses the signature to coerce it for you.
Normally for such a simple function you can use one of these and it will probably do what you want.
my &double = * * 2; # WhateverCode
my &double = * × 2; # ditto
my &double = { $_ * 2 }; # bare block
my &double = { $^n * 2 }; # block with positional placeholder
my &double = -> $n { $n * 2 }; # pointy block
my &double = sub ( $n ) { $n * 2 } # anon sub
my &double = anon sub double ( $n ) { $n * 2 } # anon sub with name
my &double = &infix:<*>.assuming(*,2); # curried
my &double = &infix:<*>.assuming(2);
sub double ( $n ) { $n * 2 } # same as :( Any $n )
Am I missing something? I'm not a Perl 6 expert, but it appears the syntax allows one to specify independently both what input types are permissible and how the input will be presented to the function.
Restricting the allowable input is useful because it means the code will result in an error, rather than a silent (useless) type conversion when the function is called with a nonsensical parameter.
I don't think an example where the two types are not in a hierarchical relationship makes sense.
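A short sketch of that trade-off (mine; the City class is just an illustrative stand-in, as in the earlier answer):
class City { }                           # no useful .Int coercion
sub strict-double (Int(Cool) $x) { 2 * $x }
sub loose-double  (Int(Any)  $x) { 2 * $x }
say strict-double '21';      # 42
# strict-double City.new;    # rejected at binding: expected Cool but got City
# loose-double  City.new;    # accepted by the signature, then .Int fails at run time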
Per comments on the original question, a better version of @musiKk's question "What is the point of coercions like Int(Cool)?" turned out to be:
Why might one prefer Int(Cool) over Int(Any)?
A corollary, which I'll also address in this answer, is:
Why might one prefer Int(Any) over Int(Cool)?
First, a list of various related options:
sub _Int_strong (Int $) {} # Argument must be Int
sub _Int_cool (Int(Cool) $) {} # Argument must be Cool; Int invoked
sub _Int_weak (Int(Any) $) {} # Argument must be Any; Int invoked
sub _Int_weak2 (Int() $) {} # same
sub _Any (Any $) {} # Argument must be Any
sub _Any2 ( $) {} # same
sub _Mu (Mu $) {} # Weakest typing - just memory safe (Mu)
_Int_strong val; # Fails to bind if val is not an Int
_Int_cool val; # Fails to bind if val is not Cool. Int invoked.
_Int_weak val; # Fails to bind if val is not Any. Int invoked.
_Any val; # Fails to bind if val is Mu
_Mu val; # Will always bind. If val is a native value, boxes it.
Why might one prefer Int(Cool) over Int(Any)?
Because Int(Cool) is slightly stronger typing. The argument must be of type Cool rather than the broader Any and:
Static analysis will reject binding code written to pass an argument that isn't Cool to a routine whose corresponding parameter has the type constraint Int(Cool). If static analysis shows there is no other routine candidate able to accept the call then the compiler will reject it at compile time. This is one of the meanings of "strong typing" explained in the last section of this answer.
If a value is Cool then it is guaranteed to have a well behaved .Int conversion method. So it will not yield a Method not found error at run-time and can be relied on to provide a good error message if it fails to produce a converted to integer value.
Why might one prefer Int(Any) over Int(Cool)?
Because Int(Any) is slightly weaker typing in that the argument can be of any regular type and P6 will just try and make it work:
.Int will be called on an argument that's passed to a routine whose corresponding parameter has the type constraint Int(...) no matter what the ... is. Provided the passed argument has an .Int method the call and subsequent conversion has a chance of succeeding.
If the .Int fails then the error message will be whatever the .Int method produces. If the argument is actually Cool then the .Int method will produce a good error message if it fails to convert to an Int. Otherwise the .Int method is presumably not a built in one and the result will be pot luck.
Why Foo(Bar) in the first place?
And what's all this about weak and strong typing?
An Int(...) constraint on a function parameter is going to result in either:
A failure to type check; or
An .Int conversion of the corresponding argument that forces it to its integer value -- or fails, leaving the corresponding parameter containing a Failure.
Using Wikipedia definitions as they were at the time of writing this answer (2019) this type checking and attempted conversion will be:
strong typing in the sense that a type constraint like Int(...) is "use of programming language types in order to both capture invariants of the code, and ensure its correctness, and definitely exclude certain classes of programming errors";
Currently weak typing in Rakudo in the sense that Rakudo does not check the ... in Int(...) at compile time even though in theory it could. That is, sub double (Int $x) {}; double Date; yields a compile time error (Calling double(Date) will never work) whereas sub double (Int(Cool) $x) {}; double Date; yields a run time error (Type check failed in binding).
type conversion;
weak typing in the sense that it's implicit type conversion in the sense that the compiler will handle the .Int coercion as part of carrying out the call;
explicit type conversion in the sense that the Int(...) constraint is explicitly directing the compiler to do the conversion as part of binding a call;
checked explicit type conversion -- P6 only does type safe conversions/coercions.
I believe the answer is as simple as: you may not want to restrict the argument to Int even though you will be treating it as Int within the sub. Say, for some reason, you want to be able to multiply an Array by a Hash, but fail if the args can't be treated as Int (i.e. are not Cool).
my @a = 1,2,3;
my %h = 'a' => 1, 'b' => 2;
say @a.Int; # 3 (List types coerce to the equivalent of .elems when treated as Int)
say %h.Int; # 2
sub m1(Int $x, Int $y) {return $x * $y}
say m1(3,2); # 6
say m1(@a,%h); # does not match
sub m2(Int(Cool) $x, Int(Cool) $y) {return $x * $y}
say m2('3',2); # 6
say m2(@a,%h); # 6
say m2('foo',2); # does not match
Of course, you could also do this without the signature because the math operation will coerce the type automatically:
sub m3($x,$y) {return $x * $y}
say m3(@a,%h); # 6
However, this defers your type check to the inside of the sub, which kind of defeats the purpose of a signature and prevents you from making the sub a multi.
All subtypes of Cool will be (as Cool requires them to) coerced to an Int. So if an operator or routine internal to your sub only works with Int arguments, you don't have to add an extra statement/expression converting to an Int nor does that operator/routine's code need to account for other subtypes of Cool. It enforces that the argument will be an Int inside of your sub wherever you use it.
Your example is backwards:
class Foo { method foomethod { say 'foomethod' } }
class Bar {}
class Quux is Bar {
    method Foo { Foo.new }
}
sub foo(Foo(Bar) $c) {
    #= converts $c of type Bar to type Foo
    #= returns result of foomethod
    say $c.WHAT; #-> (Foo)
    $c.foomethod #-> foomethod
}
foo(Quux.new)

Is comparing a BOOL against YES dangerous?

I found a comment today in a source file:
// - no longer compare BOOL against YES (dangerous!)
Is comparing BOOL against YES in Objective-C really that dangerous? And why is that?
Can the value of YES change during runtime? Maybe NO is always 0 but YES can be 1, 2 or 3 - depending on runtime, compiler, your linked frameworks?
The problem is that BOOL is not a native type, but a typedef:
typedef signed char BOOL;
#define YES (BOOL)1
#define NO (BOOL)0
As a signed char, its values aren't constrained to YES and NO. What happens with another value?
BOOL b = 42;
if (b)
{
// true
}
if (b != YES)
{
// also true
}
You should never compare booleans against anything in any of the C based languages. The right way to do it is to use either:
if (b)
or:
if (!b)
This makes your code much more readable (especially if you're using intelligently named variables and functions like isPrime(n) or childThreadHasFinished) and safe. The reason something like:
if (b == TRUE)
is not so safe is that there are actually a large number of values of b which will evaluate to true, and TRUE is only one of them.
Consider the following:
#define FALSE 0
#define TRUE 1
int flag = 7;
if (flag) printf ("number 1\n");
if (flag == TRUE) printf ("number 2\n");
You should get both those lines printed out if it were working as expected but you only get the first. That's because 7 is actually true if treated correctly (0 is false, everything else is true) but the explicit test for equality evaluates to false.
Update:
In response to your comment that you thought there'd be more to it than coder stupidity: yes, there is (but I still wouldn't discount coder stupidity as a good enough reason - defensive programming is always a good idea).
I also mentioned readability, which is rather high on my list of desirable features in code.
A condition should either be a comparison between objects or a flag (including boolean return values):
if (a == b) ...
if (c > d) ...
if (strcmp (e, "Urk") == 0) ...
if (isFinished) ...
if (userPressedEsc (ch)) ...
If you use (what I consider) an abomination like:
if (isFinished == TRUE) ...
where do you stop:
if (isFinished == TRUE) ...
if ((isFinished == TRUE) == TRUE) ...
if (((isFinished == TRUE) == TRUE) == TRUE) ...
and so on.
The right way to do it for readability is to just use appropriately named flag variables.
All this is true, but there are valid counter arguments that might be considered:
— Maybe we want to check a BOOL is actually YES or NO. Really, storing any other value than 0 or 1 in a BOOL is pretty incorrect. If it happens, isn't it more likely because of a bug somewhere else in the codebase, and isn't not explicitly checking against YES just masking this bug? I think this is way more likely than a sloppy programmer using BOOL in a non-standard way. So, I think I'd want my tests to fail if my BOOL isn't YES when I'm looking for truth.
— I don't necessarily agree that "if (isWhatever)" is more readable especially when evaluating long, but otherwise readable, function calls,
e.g. compare
if ([myObj doThisBigThingWithName:@"Name" andDate:[NSDate now]]) {}
with:
if (![myObj doThisBigThingWithName:@"Name" andDate:[NSDate now]]) {}
The first is comparing against true, the second against false and it's hard to tell the difference when quickly reading code, right?
Compare this to:
if ([myObj doThisBigThingWithName:@"Name" andDate:[NSDate now]] == YES) {}
and
if ([myObj doThisBigThingWithName:@"Name" andDate:[NSDate now]] == NO) {}
…and isn't it much more readable?
Again, I'm not saying one way is correct and the other's wrong, but there are some counterpoints.
When the code uses a BOOL variable, it is supposed to use such a variable as a boolean. The compiler doesn't check whether a BOOL variable gets a different value, in the same way the compiler doesn't check whether you initialize a variable passed to a method with a value taken from a set of constants.
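One classic way a BOOL ends up holding something other than 0 or 1 (or even silently becoming NO) is assigning a bit-mask result to it; a small sketch of mine, assuming the signed char typedef shown earlier:
#import <Foundation/Foundation.h>
int main(void) {
    NSUInteger flags = 0x0100;
    // 0x0100 is truncated when stored in a signed char BOOL: isSet becomes 0 (NO),
    // even though the flag is set. A mask like 0x0040 would instead store 64,
    // which is "true" but never compares equal to YES.
    BOOL isSet = (flags & 0x0100);
    NSLog(@"%d", isSet);   // 0 on platforms where BOOL is a signed char
    return 0;
}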