What is indirect object notation, why is it bad, and how does one avoid it? - oop

The title pretty much sums it up, but here's the long version anyway.
After posting a small snippet of perl code, I was told to avoid indirect object notation, "as it has several side effects". The comment referenced this particular line:
my $some_object = new Some::Module(FIELD => 'value');
As this is how I've always done it, in an effort to get with the times I therefore ask:
What's so bad about it? (specifically)
What are the potential (presumably negative) side effects?
How should that line be rewritten?
I was about to ask the commenter, but to me this is worthy of its own post.

The main problem is that it's ambiguous. Does
my $some_object = new Some::Module(FIELD => 'value');
mean to call the new method in the Some::Module package, or does it mean to call the new function in the current package with the result of calling the Module function in the Some package with the given parameters?
i.e, it could be parsed as:
# method call
my $some_object = Some::Module->new(FIELD => 'value');
# or function call
my $some_object = new(Some::Module(FIELD => 'value'));
The alternative is to use the explicit method call notation Some::Module->new(...).
Normally, the parser guesses correctly, but the best practice is to avoid the ambiguity.

What's so bad about it?
The problems with Indirect Method Notation are avoidable, but it's far easier to tell people to avoid Indirect Method Notation.
The main problem it's very easy to call the wrong function by accident. Take the following code, for example:
package Widget;
sub new { ... }
sub foo { ... }
sub bar { ... }
sub method {
...;
my $o = new SubWidget;
...;
}
1;
In that code, new SubWidget is expected to mean
SubWidget->new()
Instead, it actually means
new("SubWidget")
That said, using strict will catch most of these instances of this error. Were use strict; to be added to the above snippet, the following error would be produced:
Bareword "SubWidget" not allowed while "strict subs" in use at Widget.pm line 11.
That said, there are cases where using strict would not catch the error. They primarily involve the use of parens around the arguments of the method call (e.g. new SubWidget($x)).
So that means
Using Indirect Object Notation without parens can result in odd error messages.
Using Indirect Object Notation with parens can result in the wrong code being called.
The former is bearable, and the latter is avoidable. But rather than telling people "avoid using parens around the arguments of method calls using Indirect Method Notation", we simply tell people "avoid using Indirect Method Notation". It's just too fragile.
There's another issue. It's not just using Indirect Object Notation that's a problem, it's supporting it in Perl. The existence of the feature causes multiple problems. Primarily,
It causes some syntax errors to result in very odd/misleading error messages because the code appeared to be using ION when it wasn't.
It prevents useful features from being implemented since they clash with valid ION syntax.
On the plus side, using no indirect; helps the first problem.
How should that line be rewritten?
The correct way to write the method call is the following:
my $some_object = Some::Module->new(FIELD => 'value');
That said, even this syntax is ambiguous. It will first check if a function named Some::Module exists. But that's so very unlikely that very few people protect themselves from such problems. If you wanted to protect yourself, you could use the following:
my $some_object = Some::Module::->new(FIELD => 'value');

As to how to avoid it: There's a CPAN module that forbids the notation, acting like a pragma module:
no indirect;
http://metacpan.org/pod/indirect

The commenter just wanted to see Some::Module->new(FIELD => 'value'); as the constructor.
Perl can use indirect object syntax for other bare words that look like they might be methods, but nowadays the perlobj documentation suggests not to use it.
The general problem with it is that code written this way is ambiguous and exercises Perl's parser to test the namespace to e.g. check when you write method Namespace whether Namespace::method exists.

Related

How to make a class that inherits the same methods as IO::Path?

I want to build a class in Raku. Here's what I have so far:
unit class Vimwiki::File;
has Str:D $.path is required where *.IO.e;
method size {
return $.file.IO.s;
}
I'd like to get rid of the size method by simply making my class inherit the methods from IO::Path but I'm at a bit of a loss for how to accomplish this. Trying is IO::Path throws errors when I try to create a new object:
$vwf = Vimwiki::File.new(path => 't/test_file.md');
Must specify a non-empty string as a path
in block <unit> at t/01-basic.rakutest line 24
Must specify a non-empty string as a path
I always try a person's code when looking at someone's SO. Yours didn't work. (No declaration of $vwf.) That instantly alerts me that someone hasn't applied Minimal Reproducible Example principles.
So I did and less than 60 seconds later:
IO::Path.new
Yields the same error.
Why?
The doc for IO::Path.new shows its signature:
multi method new(Str:D $path, ...
So, IO::Path's new method expects a positional argument that's a Str. You (and my MRE) haven't passed a positional argument that's a Str. Thus the error message.
Of course, you've declared your own attribute $path, and have passed a named argument to set it, and that's unfortunately confused you because of the coincidence with the name path, but that's the fun of programming.
What next, take #1
Having a path attribute that duplicates IO::Path's strikes me as likely to lead to unnecessary complexity and/or bugs. So I think I'd nix that.
If all you're trying to do is wrap an additional check around the filename, then you could just write:
unit class Vimwiki::File is IO::Path;
method new ($path, |) { $path.IO.e ?? (callsame) !! die 'nope' }
callsame redispatches the ongoing routine call (the new method call), with the exact same arguments, to the next best fitting candidate(s) that would have been chosen if your new one containing the callsame hadn't been called. In this case, the next candidate(s) will be the existing new method(s) of IO::Path.
That seems fine to get started. Then you can add other attributes and methods as you see fit...
What next, take #2
...except for the IO::Path bug you filed, which means you can't initialize attributes in the normal way because IO::Path breaks the standard object construction protocol! :(
Liz shows one way to workaround this bug.
In an earlier version of this answer, I had not only showed but recommended another approach, namely delegation via handles instead of ordinary inheritance. I have since concluded that that was over-complicating things, and so removed it from this answer. And then I read your issue!
So I guess the delegation approach might still be appropriate as a workaround for a bug. So if later readers want to see it in action, follow #sdondley's link to their code. But I'm leaving it out of this (hopefully final! famous last words...) version of this answer in the hope that by the time you (later reader) read this, you just need to do something really simple like take #1.

Variable re-assign method result if not Nil

Is there idiomatic way of applying and assigning object variable method call, but only if it's defined (both method and the result)?
Like using safe call operator .? and defined-or operator //, and with "DRY principle" – using the variable only once in the operation?
Like this (but using another variable feels like cheating):
my $nicevariable = "fobar";
# key step
(my $x := $nicevariable) = $x.?possibly-nonexistent-meth // $x;
say $nicevariable; # => possibly-nonexistent-meth (non-Nil) result or "foobar"
... And avoiding andthen, if possible.
Are you aware of the default trait on variables?
Not sure if it fits your use case, but $some-var.?unknown-method returns Nil, so:
my $nicevariable is default("fobar");
$nicevariable = $nicevariable.?possibly-nonexistent-meth;
say $nicevariable; # fobar
results in $nicevariable being reset to its default value;
I'm not entirely sure what you mean by "using the variable only once in the operation". If Liz's answer qualifies, then it's probably the cleaner way to go.
If not, here's a different approach that avoids naming the variable twice:
my $nicevariable = "foobar";
$nicevariable.=&{.^lookup('possibly-nonexistent-meth')($_)}
This is a bit too cryptic for my tastes; here's what it does: If the method exists, then that's similar to¹ calling &method($nicevariable), which is the same as $nicevariable.method. If the method does not exist, then it's like calling Mu($nicevariable) – that is, coercing $nicevariable into Mu or a subtype of Mu. But since everything is already a subtype of Mu, that's a no-op and just returns $nicevariable.
[1]: Not quite, since &method would be a Sub, but basically.
EDIT:
Actually, that was over-complicating things. Here's a simpler version:
my $nicevariable = "foobar";
$nicevariable.=&{.?possibly-nonexistent-meth // $_}
Not sure why I didn't just have that to begin with…
Assuming the method returns something that is defined, you could do (if I'm understanding the question correctly):
my $x = "foobar";
$x = $_ with $y.?possibly-nonexistent-meth;
$x will remain unchanged if the method didn't exist (or it did exist and returned a type object).
As of this merge you can write:
try { let $foo .= bar }
This is a step short of what I think would have been an ideal solution, which is unfortunately a syntax error (due to a pervasive weakness in Raku's current grammar that I'm guessing is effectively unsolvable, much as I would love it to be solved):
{ let $foo .?= bar } # Malformed postfix call...
(Perhaps I'm imagining things, but I see a glimmer of hope that the above wrinkle (and many like it) will be smoothed over a few years from now. This would be after RakuAST lands for Raku .e, and the grammar gets a hoped for clean up in Raku .f or .g.)
Your question's title is:
Variable re-assign method result if not Nil
My solution does a variable re-assign method if not undefined, which is more general than just Nil.
Then again, your question's body asks for exactly that more general solution:
Is there idiomatic way of applying and assigning object variable method call, but only if it's defined (both method and the result)?
So is my solution an ideal one?
My solution is not idiomatic. But that might well be because of the bug I found, now solved with the merge linked at the start of my answer. I see no reason why it should not become idiomatic, once it's in shipping Rakudos.
The potentially big issue is that the try stores any exception thrown in $! rather than letting it blow up. Perhaps that's OK for a given use case; perhaps not.
Special thanks to you for asking your question, which prompted us to come up with various solutions, which led me to file an issue, which led vrurg to both analyse the problem I encountered and then fix it. :)

How can one invoke the non-extension `run` function (the one without scope / "object reference") in environments where there is an object scope?

Example:
data class T(val flag: Boolean) {
constructor(n: Int) : this(run {
// Some computation here...
<Boolean result>
})
}
In this example, the custom constructor needs to run some computation in order to determine which value to pass to the primary constructor, but the compiler does not accept the run, citing Cannot access 'run' before superclass constructor has been called, which, if I understand correctly, means instead of interpreting it as the non-extension run (the variant with no object reference in https://kotlinlang.org/docs/reference/scope-functions.html#function-selection), it construes it as a call to this.run (the variant with an object reference in the above table) - which is invalid as the object has not completely instantiated yet.
What can I do in order to let the compiler know I mean the run function which is not an extension method and doesn't take a scope?
Clarification: I am interested in an answer to the question as asked, not in a workaround.
I can think of several workarounds - ways to rewrite this code in a way that works as intended without calling run: extracting the code to a function; rewriting it as a (possibly highly nested) let expression; removing the run and invoking the lambda (with () after it) instead (funnily enough, IntelliJ IDEA tags that as Redundant lambda creation and suggests to Inline the body, which reinstates the non-compiling run). But the question is not how to rewrite this without using run - it's how to make run work in this context.
A good answer should do one of the following things:
Explain how to instruct the compiler to call a function rather than an extension method when a name is overloaded, in general; or
Explain how to do that specifically for run; or
Explain that (and ideally also why) it is not possible to do (ideally with supporting references); or
Explain what I got wrong, in case I got something wrong and the whole question is irrelevant (e.g. if my analysis is incorrect, and the problem is something other than the compiler construing the call to run as this.run).
If someone has a neat workaround not mentioned above they're welcome to post it in a comment - not as an answer.
In case it matters: I'm using multi-platform Kotlin 1.4.20.
Kotlin favors the receiver overload if it is in scope. The solution is to use the fully qualified name of the non-receiver function:
kotlin.run { //...
The specification is explained here.
Another option when the overloads are not in the same package is to use import renaming, but that won't work in this case since both run functions are in the same package.

What happened to Rhino Mocks' Arg<T>.Property

Arg<T>.Property is part of the documentation on inline constraints for Rhino Mocks v3.5, but I can't find it in v.3.6. What happened?
The documentation is here: http://ayende.com/Wiki/Rhino+Mocks+3.5.ashx?AspxAutoDetectCookieSupport=1#SimpleConstraints
and Arg<T>.Property is mentioned in the constraints reference table.
This seems to be a bug in the documentation. When examining the Rhino.Mocks.dll (3.6.0.0) in the Object Browser I see that Rhino.Mocks.Arg<T> only offers the methods Is and List but not Property.
However Rhino.Mocks.Constraints contains the Property class. Using the "old" syntax you should be able to do the same:
AAA syntax (producing compile error):
myStub.Expect(x => x.MethodToCall(Arg<T>.Property.Value("PropertyName", myDesiredPropertyValue))).Result(myMockResult);
Old syntax (working):
myStub.Expect(x => x.MethodToCall(null)).Constraints(Property.Value("PropertyName", myDesiredPropertyValue)).Result(myMockResult);
The documentation says "You're probably used to IgnoreArguments(), Constraints() and RefOut(). [...] It is encouraged to use only Arg<T>, it is more consistent and easier to understand, even if sometimes a little more to write."
As Jeff Bridgman noted, you can also use Arg<T>.Matches:
myStub.Expect(x => x.MethodToCall(Arg<T>.Matches(m => m.PropertyName == myDesiredPropertyValue))).Result(myMockResult);
It has the advantage of being 'refactory safe', meaning that you can refactor the name of the property safely without the need to search for any 'magic strings'. It also meets the suggestion in the documentation to rather use Arg<T> instead of Constraints().

Write a compiler for a language that looks ahead and multiple files?

In my language I can use a class variable in my method when the definition appears below the method. It can also call methods below my method and etc. There are no 'headers'. Take this C# example.
class A
{
public void callMethods() { print(); B b; b.notYetSeen();
public void print() { Console.Write("v = {0}", v); }
int v=9;
}
class B
{
public void notYetSeen() { Console.Write("notYetSeen()\n"); }
}
How should I compile that? what i was thinking is:
pass1: convert everything to an AST
pass2: go through all classes and build a list of define classes/variable/etc
pass3: go through code and check if there's any errors such as undefined variable, wrong use etc and create my output
But it seems like for this to work I have to do pass 1 and 2 for ALL files before doing pass3. Also it feels like a lot of work to do until I find a syntax error (other than the obvious that can be done at parse time such as forgetting to close a brace or writing 0xLETTERS instead of a hex value). My gut says there is some other way.
Note: I am using bison/flex to generate my compiler.
My understanding of languages that handle forward references is that they typically just use the first pass to build a list of valid names. Something along the lines of just putting an entry in a table (without filling out the definition) so you have something to point to later when you do your real pass to generate the definitions.
If you try to actually build full definitions as you go, you would end up having to rescan repatedly, each time saving any references to undefined things until the next pass. Even that would fail if there are circular references.
I would go through on pass one and collect all of your class/method/field names and types, ignoring the method bodies. Then in pass two check the method bodies only.
I don't know that there can be any other way than traversing all the files in the source.
I think that you can get it down to two passes - on the first pass, build the AST and whenever you find a variable name, add it to a list that contains that blocks' symbols (it would probably be useful to add that list to the corresponding scope in the tree). Step two is to linearly traverse the tree and make sure that each symbol used references a symbol in that scope or a scope above it.
My description is oversimplified but the basic answer is -- lookahead requires at least two passes.
The usual approach is to save B as "unknown". It's probably some kind of type (because of the place where you encountered it). So you can just reserve the memory (a pointer) for it even though you have no idea what it really is.
For the method call, you can't do much. In a dynamic language, you'd just save the name of the method somewhere and check whether it exists at runtime. In a static language, you can save it in under "unknown methods" somewhere in your compiler along with the unknown type B. Since method calls eventually translate to a memory address, you can again reserve the memory.
Then, when you encounter B and the method, you can clear up your unknowns. Since you know a bit about them, you can say whether they behave like they should or if the first usage is now a syntax error.
So you don't have to read all files twice but it surely makes things more simple.
Alternatively, you can generate these header files as you encounter the sources and save them somewhere where you can find them again. This way, you can speed up the compilation (since you won't have to consider unchanged files in the next compilation run).
Lastly, if you write a new language, you shouldn't use bison and flex anymore. There are much better tools by now. ANTLR, for example, can produce a parser that can recover after an error, so you can still parse the whole file. Or check this Wikipedia article for more options.