How to make a class that inherits the same methods as IO::Path? - raku

I want to build a class in Raku. Here's what I have so far:
unit class Vimwiki::File;
has Str:D $.path is required where *.IO.e;
method size {
return $.file.IO.s;
}
I'd like to get rid of the size method by simply making my class inherit the methods from IO::Path but I'm at a bit of a loss for how to accomplish this. Trying is IO::Path throws errors when I try to create a new object:
$vwf = Vimwiki::File.new(path => 't/test_file.md');
Must specify a non-empty string as a path
in block <unit> at t/01-basic.rakutest line 24

Must specify a non-empty string as a path
I always try a person's code when looking at someone's SO. Yours didn't work. (No declaration of $vwf.) That instantly alerts me that someone hasn't applied Minimal Reproducible Example principles.
So I did and less than 60 seconds later:
IO::Path.new
Yields the same error.
Why?
The doc for IO::Path.new shows its signature:
multi method new(Str:D $path, ...
So, IO::Path's new method expects a positional argument that's a Str. You (and my MRE) haven't passed a positional argument that's a Str. Thus the error message.
Of course, you've declared your own attribute $path, and have passed a named argument to set it, and that's unfortunately confused you because of the coincidence with the name path, but that's the fun of programming.
What next, take #1
Having a path attribute that duplicates IO::Path's strikes me as likely to lead to unnecessary complexity and/or bugs. So I think I'd nix that.
If all you're trying to do is wrap an additional check around the filename, then you could just write:
unit class Vimwiki::File is IO::Path;
method new ($path, |) { $path.IO.e ?? (callsame) !! die 'nope' }
callsame redispatches the ongoing routine call (the new method call), with the exact same arguments, to the next best fitting candidate(s) that would have been chosen if your new one containing the callsame hadn't been called. In this case, the next candidate(s) will be the existing new method(s) of IO::Path.
That seems fine to get started. Then you can add other attributes and methods as you see fit...
What next, take #2
...except for the IO::Path bug you filed, which means you can't initialize attributes in the normal way because IO::Path breaks the standard object construction protocol! :(
Liz shows one way to workaround this bug.
In an earlier version of this answer, I had not only showed but recommended another approach, namely delegation via handles instead of ordinary inheritance. I have since concluded that that was over-complicating things, and so removed it from this answer. And then I read your issue!
So I guess the delegation approach might still be appropriate as a workaround for a bug. So if later readers want to see it in action, follow #sdondley's link to their code. But I'm leaving it out of this (hopefully final! famous last words...) version of this answer in the hope that by the time you (later reader) read this, you just need to do something really simple like take #1.

Related

How can one invoke the non-extension `run` function (the one without scope / "object reference") in environments where there is an object scope?

Example:
data class T(val flag: Boolean) {
constructor(n: Int) : this(run {
// Some computation here...
<Boolean result>
})
}
In this example, the custom constructor needs to run some computation in order to determine which value to pass to the primary constructor, but the compiler does not accept the run, citing Cannot access 'run' before superclass constructor has been called, which, if I understand correctly, means instead of interpreting it as the non-extension run (the variant with no object reference in https://kotlinlang.org/docs/reference/scope-functions.html#function-selection), it construes it as a call to this.run (the variant with an object reference in the above table) - which is invalid as the object has not completely instantiated yet.
What can I do in order to let the compiler know I mean the run function which is not an extension method and doesn't take a scope?
Clarification: I am interested in an answer to the question as asked, not in a workaround.
I can think of several workarounds - ways to rewrite this code in a way that works as intended without calling run: extracting the code to a function; rewriting it as a (possibly highly nested) let expression; removing the run and invoking the lambda (with () after it) instead (funnily enough, IntelliJ IDEA tags that as Redundant lambda creation and suggests to Inline the body, which reinstates the non-compiling run). But the question is not how to rewrite this without using run - it's how to make run work in this context.
A good answer should do one of the following things:
Explain how to instruct the compiler to call a function rather than an extension method when a name is overloaded, in general; or
Explain how to do that specifically for run; or
Explain that (and ideally also why) it is not possible to do (ideally with supporting references); or
Explain what I got wrong, in case I got something wrong and the whole question is irrelevant (e.g. if my analysis is incorrect, and the problem is something other than the compiler construing the call to run as this.run).
If someone has a neat workaround not mentioned above they're welcome to post it in a comment - not as an answer.
In case it matters: I'm using multi-platform Kotlin 1.4.20.
Kotlin favors the receiver overload if it is in scope. The solution is to use the fully qualified name of the non-receiver function:
kotlin.run { //...
The specification is explained here.
Another option when the overloads are not in the same package is to use import renaming, but that won't work in this case since both run functions are in the same package.

Kotlin syntax issue

Sorry for the terrible title, but I can't seem to find an allowable way to ask this question, because I don't know how to refer to the code constructs I am looking at.
Looking at this file: https://github.com/Hexworks/caves-of-zircon-tutorial/blob/master/src/main/kotlin/org/hexworks/cavesofzircon/systems/InputReceiver.kt
I don't understand what is going on here:
override fun update(entity: GameEntity<out EntityType>, context: GameContext): Boolean {
val (_, _, uiEvent, player) = context
I can understand some things.
We are overriding the update function, which is defined in the Behavior class, which is a superclass of this class.
The update function accepts two parameters. A GameEntity named entity, and a GameContext called context.
The function returns a Boolean result.
However, I do not understand the next line at all. Just open and close parentheses, two underscores as the first two parameters, and then an assignment to the context argument. What is it we are assigning the value of context to?
Based on IDE behavior, apparently the open-close parentheses are related to the constructor for GameContext. But I would not know that otherwise. I also don't understand what the meaning is of the underscores in the argument list.
And finally, I have read about the declaration-site variance keyword "out", but I don't really understand what it means here. We have GameEntity<out EntityType>. So as I understand it, that means this method produces EntityType, but does not consume it. How is that demonstrated in this code?
val (_, _, uiEvent, player) = context
You are extracting the 3rd and 4th value from the context and ignoring the first two.
Compare https://kotlinlang.org/docs/reference/multi-declarations.html .
About out: i don't see it being used in the code snippet you're showing. You might want to show the full method.
Also, maybe it is there only for the purpose of overriding the method, to match the signature of the function.
To cover the little bit that Incubbus's otherwise-great answer missed:
In the declaration
override fun update(entity: GameEntity<out EntityType>, // …
the out means that you could call the function and pass a GameEntity<SubclassOfEntityType> (or even a SubclassOfGameEntity<SubclassOfEntityType>).
With no out, you'd have to pass a GameEntity<EntityType> (or a SubclassOfGameEntity<EntityType>).
I guess that's inherited from the superclass method that you're overriding.  After all, if the superclass method could be called with a GameEntity<SubclassOfEntityType>, then your override will need to handle that too.  (The Liskov substitution principle in action!)

What does ‘serial’ do?

From the docs that say,
Returns the self-reference to the instance itself:
my $b; # defaults to Any
say $b.serial.^name; # OUTPUT: «Any␤»
my $breakfast = 'food';
$breakfast.serial.say; # OUTPUT: «food␤»
I do not have the foggiest what this routine does, please can someone explain?
On Supplys, it is an informational method that is supposed to indicate whether there will never be any concurrent emit on that Supply.
On HyperSeq and RaceSeq, it returns a serialized Seq, so you could consider it the opposite of the hyper and race method.
In general, it appears to return itself, which seems to make sense from the HyperSeq and RaceSeq point of view.
And yes, these should be documented properly, so please create a documentation issue. Thank you!
In the doc example it does nothing. That is, if you remove it you get the same results:
my $b; # defaults to Any
say $b.^name; # OUTPUT: «Any␤»
my $breakfast = 'food';
$breakfast.say; # OUTPUT: «food␤»
More generally, I think you'd best ignore the serial method other than to open a doc issue pointing to this SO if you'd like to improve the doc.
The serial method does not appear to be in the official language
A search of the roast repo for "serial" yields zero matches.
Within Rakudo source code the method name serial has been overloaded to have one of three meanings:
A boolean declaring whether a Supply sequence is always serial. Rakudo source examples: 1, 2. This looks to me like an internal method that doesn't need to be documented.
A coercion of parallel sequence (hyper or race) to a serial version of the same sequence. This looks to me like an internal method that doesn't need to be documented.
A "no op" that returns its invocant. I suspect it would be best if it were not documented, at least until such time as its raison d'etre is clear; its official status viz-a-viz the spec (roast) is clear; and/or there's an attempt to systematically document which operations have the is nodal set on them.
None of the above seems to warrant ordinary users' attention or documentation.
The Any class definition of a serial method seems pointless
The Any class serial method returns self, i.e. when called it is a no op.
I don't currently understand why there is an Any class definition.
One possible point for it would be that there are .serial calls made by internal code on instances of an unknown and generally unknowable class and there thus needs to be a default definition of serial in the Any class.
But a search of the rakudo repo for ".serial" suggests that calls are only made to supplies or hyper/race seqs.
That said, I note the is nodal trait on the proto serial declaration in Any that immediately precedes the multi method serial declaration. Perhaps that is the reason it's in Any.
See also Arbitrary drift of methods to Mu and Any.
The documentation you quoted seems pointless
The definition and example seem to reflect someone's sense of humor. I applaud use of humor but in this case I suspect the best improvement would be to just remove the page you linked.

What is indirect object notation, why is it bad, and how does one avoid it?

The title pretty much sums it up, but here's the long version anyway.
After posting a small snippet of perl code, I was told to avoid indirect object notation, "as it has several side effects". The comment referenced this particular line:
my $some_object = new Some::Module(FIELD => 'value');
As this is how I've always done it, in an effort to get with the times I therefore ask:
What's so bad about it? (specifically)
What are the potential (presumably negative) side effects?
How should that line be rewritten?
I was about to ask the commenter, but to me this is worthy of its own post.
The main problem is that it's ambiguous. Does
my $some_object = new Some::Module(FIELD => 'value');
mean to call the new method in the Some::Module package, or does it mean to call the new function in the current package with the result of calling the Module function in the Some package with the given parameters?
i.e, it could be parsed as:
# method call
my $some_object = Some::Module->new(FIELD => 'value');
# or function call
my $some_object = new(Some::Module(FIELD => 'value'));
The alternative is to use the explicit method call notation Some::Module->new(...).
Normally, the parser guesses correctly, but the best practice is to avoid the ambiguity.
What's so bad about it?
The problems with Indirect Method Notation are avoidable, but it's far easier to tell people to avoid Indirect Method Notation.
The main problem it's very easy to call the wrong function by accident. Take the following code, for example:
package Widget;
sub new { ... }
sub foo { ... }
sub bar { ... }
sub method {
...;
my $o = new SubWidget;
...;
}
1;
In that code, new SubWidget is expected to mean
SubWidget->new()
Instead, it actually means
new("SubWidget")
That said, using strict will catch most of these instances of this error. Were use strict; to be added to the above snippet, the following error would be produced:
Bareword "SubWidget" not allowed while "strict subs" in use at Widget.pm line 11.
That said, there are cases where using strict would not catch the error. They primarily involve the use of parens around the arguments of the method call (e.g. new SubWidget($x)).
So that means
Using Indirect Object Notation without parens can result in odd error messages.
Using Indirect Object Notation with parens can result in the wrong code being called.
The former is bearable, and the latter is avoidable. But rather than telling people "avoid using parens around the arguments of method calls using Indirect Method Notation", we simply tell people "avoid using Indirect Method Notation". It's just too fragile.
There's another issue. It's not just using Indirect Object Notation that's a problem, it's supporting it in Perl. The existence of the feature causes multiple problems. Primarily,
It causes some syntax errors to result in very odd/misleading error messages because the code appeared to be using ION when it wasn't.
It prevents useful features from being implemented since they clash with valid ION syntax.
On the plus side, using no indirect; helps the first problem.
How should that line be rewritten?
The correct way to write the method call is the following:
my $some_object = Some::Module->new(FIELD => 'value');
That said, even this syntax is ambiguous. It will first check if a function named Some::Module exists. But that's so very unlikely that very few people protect themselves from such problems. If you wanted to protect yourself, you could use the following:
my $some_object = Some::Module::->new(FIELD => 'value');
As to how to avoid it: There's a CPAN module that forbids the notation, acting like a pragma module:
no indirect;
http://metacpan.org/pod/indirect
The commenter just wanted to see Some::Module->new(FIELD => 'value'); as the constructor.
Perl can use indirect object syntax for other bare words that look like they might be methods, but nowadays the perlobj documentation suggests not to use it.
The general problem with it is that code written this way is ambiguous and exercises Perl's parser to test the namespace to e.g. check when you write method Namespace whether Namespace::method exists.

Write a compiler for a language that looks ahead and multiple files?

In my language I can use a class variable in my method when the definition appears below the method. It can also call methods below my method and etc. There are no 'headers'. Take this C# example.
class A
{
public void callMethods() { print(); B b; b.notYetSeen();
public void print() { Console.Write("v = {0}", v); }
int v=9;
}
class B
{
public void notYetSeen() { Console.Write("notYetSeen()\n"); }
}
How should I compile that? what i was thinking is:
pass1: convert everything to an AST
pass2: go through all classes and build a list of define classes/variable/etc
pass3: go through code and check if there's any errors such as undefined variable, wrong use etc and create my output
But it seems like for this to work I have to do pass 1 and 2 for ALL files before doing pass3. Also it feels like a lot of work to do until I find a syntax error (other than the obvious that can be done at parse time such as forgetting to close a brace or writing 0xLETTERS instead of a hex value). My gut says there is some other way.
Note: I am using bison/flex to generate my compiler.
My understanding of languages that handle forward references is that they typically just use the first pass to build a list of valid names. Something along the lines of just putting an entry in a table (without filling out the definition) so you have something to point to later when you do your real pass to generate the definitions.
If you try to actually build full definitions as you go, you would end up having to rescan repatedly, each time saving any references to undefined things until the next pass. Even that would fail if there are circular references.
I would go through on pass one and collect all of your class/method/field names and types, ignoring the method bodies. Then in pass two check the method bodies only.
I don't know that there can be any other way than traversing all the files in the source.
I think that you can get it down to two passes - on the first pass, build the AST and whenever you find a variable name, add it to a list that contains that blocks' symbols (it would probably be useful to add that list to the corresponding scope in the tree). Step two is to linearly traverse the tree and make sure that each symbol used references a symbol in that scope or a scope above it.
My description is oversimplified but the basic answer is -- lookahead requires at least two passes.
The usual approach is to save B as "unknown". It's probably some kind of type (because of the place where you encountered it). So you can just reserve the memory (a pointer) for it even though you have no idea what it really is.
For the method call, you can't do much. In a dynamic language, you'd just save the name of the method somewhere and check whether it exists at runtime. In a static language, you can save it in under "unknown methods" somewhere in your compiler along with the unknown type B. Since method calls eventually translate to a memory address, you can again reserve the memory.
Then, when you encounter B and the method, you can clear up your unknowns. Since you know a bit about them, you can say whether they behave like they should or if the first usage is now a syntax error.
So you don't have to read all files twice but it surely makes things more simple.
Alternatively, you can generate these header files as you encounter the sources and save them somewhere where you can find them again. This way, you can speed up the compilation (since you won't have to consider unchanged files in the next compilation run).
Lastly, if you write a new language, you shouldn't use bison and flex anymore. There are much better tools by now. ANTLR, for example, can produce a parser that can recover after an error, so you can still parse the whole file. Or check this Wikipedia article for more options.