Differences between .Bool, .so, ? and so - raku

I’m trying to figure out what the differences are between the above-mentioned routines, and if statements like
say $y.Bool;
say $y.so;
say ? $y;
say so $y;
would ever produce a different result.
So far the only difference that is apparent to me is that ? has a higher precedence than so. .Bool and .so seem to be completely synonymous. Is that correct and (practically speaking) the full story?

What I've done to answer your question is to spelunk the Rakudo compiler source code.
As you note, one aspect that differs between the prefixes is parsing differences. The variations have different precedences and so is alphabetic whereas ? is punctuation. To see the precise code controlling this parsing, view Rakudo's Grammar.nqp and search within that page for prefix:sym<...> where the ... is ?, so, etc. It looks like ternary (... ?? ... !! ...) turns into an if. I see that none of these tokens have correspondingly named Actions.pm6 methods. As a somewhat wild guess perhaps the code generation that corresponds to them is handled by this part of method EXPR. (Anyone know, or care to follow the instructions in this blog post to find out?)
The definitions in Bool.pm6 and Mu.pm6 show that:
In Mu.pm6 the method .Bool returns False for an undefined object and .defined otherwise. In turn .defined returns False for an undefined object and True otherwise. So these are the default.
.defined is documented as overridden in two built in classes and .Bool in 19.
so, .so, and ? all call the same code that defers to Bool / .Bool. In theory classes/modules could override these instead of, or as well, as overriding .Bool or .defined, but I can't see why anyone would ever do that either in the built in classes/modules or userland ones.
not and ! are the same (except that use of ! with :exists dies) and both turn into calls to nqp::hllbool(nqp::not_i(nqp::istrue(...))). I presume the primary reason they don't go through the usual .Bool route is to avoid marking handling of Failures.
There are .so and .not methods defined in Mu.pm6. They just call .Bool.
There are boolean bitwise operators that include a ?. They are far adrift from your question but their code is included in the links above.

Related

inspect regex made with a variable

sub make-regex {
my $what's-in-the-box = rand > .5 ?? 'x' !! 'y';
/$what's-in-the-box/
}
my $lol-who-knows = make-regex;
$lol-who-knows.gist.say;
How do you see the innards of the regex (i.e. x or y)? Forcing a match is not a solution.
How do you see the innards of the regex (i.e. x or y)?
You:
Statically compile the regex. Doing so will involve use of a raku compiler, i.e. Rakudo.
Dynamically evaluate the regex so that the $what's-in-the-box variable in your regex gets interpolated and turns into x or y. Doing so will involve running the regex as code. That in turn means both using Rakudo and running the regex with a Match object invocant (or sub-class instance or, conceivably, a mock object equivalent).
View the resulting regex. Doing so will involve using compiler (Rakudo) toolchain specific introspection or debug functionality.
Forcing a match is not a solution.
You can run a regex with no input if your concern is just to avoid the need for a successful match:
say Match.new.&$lol-who-knows; # #<failed match>
But it must be run, otherwise the $what's-in-the-box variable won't turn into an x or y. You might think you could cheat on this by writing something that mimics this part of raku regex construction/use but there's good reason to think that's not going to work out[1].
And then you must view its innards, using Rakudo toolchain features, after you've started running it and before it finishes running.
A regex is a method
In raku, a Regex is code, a sub-class of Method.
Until you run it, by using it in a match, that code is just /$what's-in-the-box/ (aka regex { $what's-in-the-box }). It's the equivalent of something like:
method {
...self should be a Match or a sub-class of it
...if self is not an instance, create a new one
...do matching -- compiler evaluates $what's-in-the-box during this
...return updated self / new instance
}
(To see a bit more detail, see Moritz's answer to the SO Can I change the slang inside a method?.)
A routine is a closure
You create your regex inside the closure named make-regex. The compiler spots that you've used $what's-in-the-box and therefore hangs on to the variable even after the make-regex closure returns. Later on, if/when your regex is run, the $what's-in-the-box variable will be replaced by its value at the moment the regex is run.
Footnotes
[1] There are plenty of huge complications you'll encounter if you try to do an end-run around using the compiler. Even something simple like interpolation is non-trivial. To quote Jonathan Worthington, in 2020:
I thought oh I'm only building [a tool that's] not the real compiler. I can ... have a simpler model of the symbol resolution. In the end it turned out that no I couldn't really. That started to create us some problems. When we aligned the way that we resolved symbols with the same algorithms and lookup structures that was being used in the compiler then suddenly it all became a lot simpler.

What does ‘serial’ do?

From the docs that say,
Returns the self-reference to the instance itself:
my $b; # defaults to Any
say $b.serial.^name; # OUTPUT: «Any␤»
my $breakfast = 'food';
$breakfast.serial.say; # OUTPUT: «food␤»
I do not have the foggiest what this routine does, please can someone explain?
On Supplys, it is an informational method that is supposed to indicate whether there will never be any concurrent emit on that Supply.
On HyperSeq and RaceSeq, it returns a serialized Seq, so you could consider it the opposite of the hyper and race method.
In general, it appears to return itself, which seems to make sense from the HyperSeq and RaceSeq point of view.
And yes, these should be documented properly, so please create a documentation issue. Thank you!
In the doc example it does nothing. That is, if you remove it you get the same results:
my $b; # defaults to Any
say $b.^name; # OUTPUT: «Any␤»
my $breakfast = 'food';
$breakfast.say; # OUTPUT: «food␤»
More generally, I think you'd best ignore the serial method other than to open a doc issue pointing to this SO if you'd like to improve the doc.
The serial method does not appear to be in the official language
A search of the roast repo for "serial" yields zero matches.
Within Rakudo source code the method name serial has been overloaded to have one of three meanings:
A boolean declaring whether a Supply sequence is always serial. Rakudo source examples: 1, 2. This looks to me like an internal method that doesn't need to be documented.
A coercion of parallel sequence (hyper or race) to a serial version of the same sequence. This looks to me like an internal method that doesn't need to be documented.
A "no op" that returns its invocant. I suspect it would be best if it were not documented, at least until such time as its raison d'etre is clear; its official status viz-a-viz the spec (roast) is clear; and/or there's an attempt to systematically document which operations have the is nodal set on them.
None of the above seems to warrant ordinary users' attention or documentation.
The Any class definition of a serial method seems pointless
The Any class serial method returns self, i.e. when called it is a no op.
I don't currently understand why there is an Any class definition.
One possible point for it would be that there are .serial calls made by internal code on instances of an unknown and generally unknowable class and there thus needs to be a default definition of serial in the Any class.
But a search of the rakudo repo for ".serial" suggests that calls are only made to supplies or hyper/race seqs.
That said, I note the is nodal trait on the proto serial declaration in Any that immediately precedes the multi method serial declaration. Perhaps that is the reason it's in Any.
See also Arbitrary drift of methods to Mu and Any.
The documentation you quoted seems pointless
The definition and example seem to reflect someone's sense of humor. I applaud use of humor but in this case I suspect the best improvement would be to just remove the page you linked.

Should I give up grammatical correctness when naming my functions to offer regularity?

I implement several global functions in our library that look something like this:
void init_time();
void init_random();
void init_shapes();
I would like to add functions to provide a check whether those have been called:
bool is_time_initialized();
bool is_random_initialized();
bool are_shapes_initialized();
However, as you can see are_shapes_initialized falls out of the row due to the fact that shapes is plural and therefore the function name must start with are and not is. This could be a problem, as the library is rather large and not having a uniform way to group similiar functions under the same naming convention might be confusing / upsetting.
E.g. a user using IntelliSense quickly looking up function names to see if the libary offers a way to check if their initialization call happened:
They won't find are_shapes_initialized() here unless scrolling through hundreds of additional function / class names.
Just going with is_shapes_initialized() could offer clarity:
As this displays all functions, now.
But how can using wrong grammar be a good approach? Shouldn't I just assume that the user should also ask IntelliSense for "are_initialized" or just look into the documentation in the first place? Probably not, right? Should I just give up on grammatical correctness?
The way I see it, a variable is a single entity. Maybe that entity is an aggregate of other entities, such as an array or a collection, in which case it would make sense to give it a plural name e.g. a set of Shape objects could be called shapes. Even so, it is still a single object. Looking at it that way, it is grammatically acceptable to refer to it as singular. After all, is_shapes_initialized actually means "Is the variable 'shapes' initialized?"
It's the same reason we say "The Bahamas is" or "The Netherlands is", because we are referring to the singular country, not whatever plural entity it is comprised of. So yes, is_shapes_initialized can be considered grammatically correct.
It's more a matter of personal taste. I would recommend putting "is" before functions that return Boolean. This would look more like:
bool is_time_initialized();
bool is_random_initialized();
bool is_shapes_initialized();
This makes them easier to find and search for, even if they aren't grammatically correct.
You can find functions using "are" to show it is plural in places such as the DuckDuckGo app, with:
areItemsTheSame(...)
areContentsTheSame(...)
In the DuckDuckGo app, it also uses "is" to show functions return boolean, and boolean variables:
val isFullScreen: Boolean = false
isAssignableFrom(...)
In OpenTK, a C# Graphics Library, I also found usage of "are":
AreTexturesResident(...)
AreProgramsResident(...)
In the same OpenTK Libary, they use "is" singularly for functions that return boolean and boolean variables:
IsEnabledGenlock(...)
bool isControl = false;
Either usage could work. Using "are" plurally would make more sense grammatically, and using "if" plurally could make more sense for efficiency or simplifying Boolean functions.
Here's what I would do, assuming you are trying to avoid calling this function on each shape.
void init_each_shape();
bool is_each_shape_initialized();
Also assuming that you need these functions, it seems like it would make more sense to have the functions throw an exception if they do not succeed.

Boolean method naming readability

Simple question, from a readability standpoint, which method name do you prefer for a boolean method:
public boolean isUserExist(...)
or:
public boolean doesUserExist(...)
or:
public boolean userExists(...)
public boolean userExists(...)
Would be my prefered. As it makes your conditional checks far more like natural english:
if userExists ...
But I guess there is no hard and fast rule - just be consistent
I would say userExists, because 90% of the time my calling code will look like this:
if userExists(...) {
...
}
and it reads very literally in English.
if isUserExist and if doesUserExist seem redundant.
Beware of sacrificing clarity whilst chasing readability.
Although if (user.ExistsInDatabase(db)) reads nicer than if (user.CheckExistsInDatabase(db)), consider the case of a class with a builder pattern, (or any class which you can set state on):
user.WithName("Mike").ExistsInDatabase(db).ExistsInDatabase(db2).Build();
It's not clear if ExistsInDatabase is checking whether it does exist, or setting the fact that it does exist. You wouldn't write if (user.Age()) or if (user.Name()) without any comparison value, so why is if (user.Exists()) a good idea purely because that property/function is of boolean type and you can rename the function/property to read more like natural english? Is it so bad to follow the same pattern we use for other types other than booleans?
With other types, an if statement compares the return value of a function to a value in code, so the code looks something like:
if (user.GetAge() >= 18) ...
Which reads as "if user dot get age is greater than or equal to 18..." true - it's not "natural english", but I would argue that object.verb never resembled natural english and this is simply a basic facet of modern programming (for many mainstream languages). Programmers generally don't have a problem understanding the above statement, so is the following any worse?
if (user.CheckExists() == true)
Which is normally shortened to
if (user.CheckExists())
Followed by the fatal step
if (user.Exists())
Whilst it has been said that "code is read 10x more often than written", it is also very important that bugs are easy to spot. Suppose you had a function called Exists() which causes the object to exist, and returns true/false based on success. You could easily see the code if (user.Exists()) and not spot the bug - the bug would be very much more obvious if the code read if (user.SetExists()) for example.
Additionally, user.Exists() could easily contain complex or inefficient code, round tripping to a database to check something. user.CheckExists() makes it clear that the function does something.
See also all the responses here: Naming Conventions: What to name a method that returns a boolean?
As a final note - following "Tell Don't Ask", a lot of the functions that return true/false disappear anyway, and instead of asking an object for its state, you tell it to do something, which it can do in different ways based on its state.
The goal for readability should always be to write code the closest possible to natural language. So in this case, userExists seems the best choice. Using the prefix "is" may nonetheless be right in another situations, for example isProcessingComplete.
My simple rule to this question is this:
If the boolean method already HAS a verb, don't add one. Otherwise, consider it. Some examples:
$user->exists()
$user->loggedIn()
$user->isGuest() // "is" added
I would go with userExists() because 1) it makes sense in natural language, and 2) it follows the conventions of the APIs I have seen.
To see if it make sense in natural language, read it out loud. "If user exists" sounds more like a valid English phrase than "if is user exists" or "if does user exist". "If the user exists" would be better, but "the" is probably superfluous in a method name.
To see whether a file exists in Java SE 6, you would use File.exists(). This looks like it will be the same in version 7. C# uses the same convention, as do Python and Ruby. Hopefully, this is a diverse enough collection to call this a language-agnostic answer. Generally, I would side with naming methods in keeping with your language's API.
There are things to consider that I think were missed by several other answers here
It depends if this is a C++ class method or a C function. If this is a method then it will likely be called if (user.exists()) { ... } or if (user.isExisting()) { ... }
not if (user_exists(&user)) .
This is the reason behind coding standards that state bool methods should begin with a verb since they will read like a sentence when the object is in front of them.
Unfortunately lots of old C functions return 0 for success and non-zero for failure so it can be difficult to determine the style being used unless you follow the all bool functions begin with verbs or always compare to true like so if (true == user_exists(&user))
Why not rename the property then?
if (user.isPresent()) {
Purely subjective.
I prefer userExists(...) because then statements like this read better:
if ( userExists( ... ) )
or
while ( userExists( ... ) )
In this particular case, the first example is such horrible English that it makes me wince.
I'd probably go for number three because of how it sounds when reading it in if statements. "If user exists" sounds better than "If does user exists".
This is assuming it's going to be to used in if statement tests of course...
I like any of these:
userExists(...)
isUserNameTaken(...)
User.exists(...)
User.lookup(...) != null
Method names serves for readability, only the ones fit into your whole code would be the best which most of the case it begins with conditions thus subjectPredicate follows natural sentence structure.
Since I follow the convention to put verb before function name, I would do the same here too:
//method name
public boolean doesExists(...)
//this way you can also keep a variable to store the result
bool userExists = user.doesExists()
//and use it like a english phrase
if (userExists) {...}
//or you can use the method name directly also and it will make sense here too
if (user.doesExists()) {...}

Write a compiler for a language that looks ahead and multiple files?

In my language I can use a class variable in my method when the definition appears below the method. It can also call methods below my method and etc. There are no 'headers'. Take this C# example.
class A
{
public void callMethods() { print(); B b; b.notYetSeen();
public void print() { Console.Write("v = {0}", v); }
int v=9;
}
class B
{
public void notYetSeen() { Console.Write("notYetSeen()\n"); }
}
How should I compile that? what i was thinking is:
pass1: convert everything to an AST
pass2: go through all classes and build a list of define classes/variable/etc
pass3: go through code and check if there's any errors such as undefined variable, wrong use etc and create my output
But it seems like for this to work I have to do pass 1 and 2 for ALL files before doing pass3. Also it feels like a lot of work to do until I find a syntax error (other than the obvious that can be done at parse time such as forgetting to close a brace or writing 0xLETTERS instead of a hex value). My gut says there is some other way.
Note: I am using bison/flex to generate my compiler.
My understanding of languages that handle forward references is that they typically just use the first pass to build a list of valid names. Something along the lines of just putting an entry in a table (without filling out the definition) so you have something to point to later when you do your real pass to generate the definitions.
If you try to actually build full definitions as you go, you would end up having to rescan repatedly, each time saving any references to undefined things until the next pass. Even that would fail if there are circular references.
I would go through on pass one and collect all of your class/method/field names and types, ignoring the method bodies. Then in pass two check the method bodies only.
I don't know that there can be any other way than traversing all the files in the source.
I think that you can get it down to two passes - on the first pass, build the AST and whenever you find a variable name, add it to a list that contains that blocks' symbols (it would probably be useful to add that list to the corresponding scope in the tree). Step two is to linearly traverse the tree and make sure that each symbol used references a symbol in that scope or a scope above it.
My description is oversimplified but the basic answer is -- lookahead requires at least two passes.
The usual approach is to save B as "unknown". It's probably some kind of type (because of the place where you encountered it). So you can just reserve the memory (a pointer) for it even though you have no idea what it really is.
For the method call, you can't do much. In a dynamic language, you'd just save the name of the method somewhere and check whether it exists at runtime. In a static language, you can save it in under "unknown methods" somewhere in your compiler along with the unknown type B. Since method calls eventually translate to a memory address, you can again reserve the memory.
Then, when you encounter B and the method, you can clear up your unknowns. Since you know a bit about them, you can say whether they behave like they should or if the first usage is now a syntax error.
So you don't have to read all files twice but it surely makes things more simple.
Alternatively, you can generate these header files as you encounter the sources and save them somewhere where you can find them again. This way, you can speed up the compilation (since you won't have to consider unchanged files in the next compilation run).
Lastly, if you write a new language, you shouldn't use bison and flex anymore. There are much better tools by now. ANTLR, for example, can produce a parser that can recover after an error, so you can still parse the whole file. Or check this Wikipedia article for more options.