SLICC indexing syntax - gem5

I have been digging through the cache-related parts of Gem5 (particularly the parts related to directories), and I've hit a bit of a snag.
This is the code for getDirectoryEntry(Addr addr), in src/mem/ruby/protocol/MESI_Two_Level-dir.sm:
Entry getDirectoryEntry(Addr addr), return_by_pointer="yes" {
Entry dir_entry := static_cast(Entry, "pointer", directory[addr]);
if (is_valid(dir_entry)) {
return dir_entry;
}
dir_entry := static_cast(Entry, "pointer",
directory.allocate(addr, new Entry));
return dir_entry;
}
Note the first line inside the function, where it says directory[addr].
directory was previously defined like so:
machine(MachineType:Directory, "MESI Two Level directory protocol")
: DirectoryMemory * directory;
...
I am trying to understand exactly what that directory[addr] bit of code means. Intuitively, it may be calling the C++ DirectoryMemory::lookup(Addr address) method, but I haven't found any code or documentation that supports that guess.
The DirectoryMemory class doesn't define an indexing operator, and there's also nothing in the SLICC page on the wiki that describes an indexing operator.
tl;dr: what does the indexing operator mean in SLICC? It it's defined for particular objects somewhere in the SLICC code, what should I be looking for to find its definition?
Thanks in advance!

Figured it out myself. It calls the lookup method in DirectoryMemory, as I suspected.

Related

How to make a class that inherits the same methods as IO::Path?

I want to build a class in Raku. Here's what I have so far:
unit class Vimwiki::File;
has Str:D $.path is required where *.IO.e;
method size {
return $.file.IO.s;
}
I'd like to get rid of the size method by simply making my class inherit the methods from IO::Path but I'm at a bit of a loss for how to accomplish this. Trying is IO::Path throws errors when I try to create a new object:
$vwf = Vimwiki::File.new(path => 't/test_file.md');
Must specify a non-empty string as a path
in block <unit> at t/01-basic.rakutest line 24
Must specify a non-empty string as a path
I always try a person's code when looking at someone's SO. Yours didn't work. (No declaration of $vwf.) That instantly alerts me that someone hasn't applied Minimal Reproducible Example principles.
So I did and less than 60 seconds later:
IO::Path.new
Yields the same error.
Why?
The doc for IO::Path.new shows its signature:
multi method new(Str:D $path, ...
So, IO::Path's new method expects a positional argument that's a Str. You (and my MRE) haven't passed a positional argument that's a Str. Thus the error message.
Of course, you've declared your own attribute $path, and have passed a named argument to set it, and that's unfortunately confused you because of the coincidence with the name path, but that's the fun of programming.
What next, take #1
Having a path attribute that duplicates IO::Path's strikes me as likely to lead to unnecessary complexity and/or bugs. So I think I'd nix that.
If all you're trying to do is wrap an additional check around the filename, then you could just write:
unit class Vimwiki::File is IO::Path;
method new ($path, |) { $path.IO.e ?? (callsame) !! die 'nope' }
callsame redispatches the ongoing routine call (the new method call), with the exact same arguments, to the next best fitting candidate(s) that would have been chosen if your new one containing the callsame hadn't been called. In this case, the next candidate(s) will be the existing new method(s) of IO::Path.
That seems fine to get started. Then you can add other attributes and methods as you see fit...
What next, take #2
...except for the IO::Path bug you filed, which means you can't initialize attributes in the normal way because IO::Path breaks the standard object construction protocol! :(
Liz shows one way to workaround this bug.
In an earlier version of this answer, I had not only showed but recommended another approach, namely delegation via handles instead of ordinary inheritance. I have since concluded that that was over-complicating things, and so removed it from this answer. And then I read your issue!
So I guess the delegation approach might still be appropriate as a workaround for a bug. So if later readers want to see it in action, follow #sdondley's link to their code. But I'm leaving it out of this (hopefully final! famous last words...) version of this answer in the hope that by the time you (later reader) read this, you just need to do something really simple like take #1.

How can one invoke the non-extension `run` function (the one without scope / "object reference") in environments where there is an object scope?

Example:
data class T(val flag: Boolean) {
constructor(n: Int) : this(run {
// Some computation here...
<Boolean result>
})
}
In this example, the custom constructor needs to run some computation in order to determine which value to pass to the primary constructor, but the compiler does not accept the run, citing Cannot access 'run' before superclass constructor has been called, which, if I understand correctly, means instead of interpreting it as the non-extension run (the variant with no object reference in https://kotlinlang.org/docs/reference/scope-functions.html#function-selection), it construes it as a call to this.run (the variant with an object reference in the above table) - which is invalid as the object has not completely instantiated yet.
What can I do in order to let the compiler know I mean the run function which is not an extension method and doesn't take a scope?
Clarification: I am interested in an answer to the question as asked, not in a workaround.
I can think of several workarounds - ways to rewrite this code in a way that works as intended without calling run: extracting the code to a function; rewriting it as a (possibly highly nested) let expression; removing the run and invoking the lambda (with () after it) instead (funnily enough, IntelliJ IDEA tags that as Redundant lambda creation and suggests to Inline the body, which reinstates the non-compiling run). But the question is not how to rewrite this without using run - it's how to make run work in this context.
A good answer should do one of the following things:
Explain how to instruct the compiler to call a function rather than an extension method when a name is overloaded, in general; or
Explain how to do that specifically for run; or
Explain that (and ideally also why) it is not possible to do (ideally with supporting references); or
Explain what I got wrong, in case I got something wrong and the whole question is irrelevant (e.g. if my analysis is incorrect, and the problem is something other than the compiler construing the call to run as this.run).
If someone has a neat workaround not mentioned above they're welcome to post it in a comment - not as an answer.
In case it matters: I'm using multi-platform Kotlin 1.4.20.
Kotlin favors the receiver overload if it is in scope. The solution is to use the fully qualified name of the non-receiver function:
kotlin.run { //...
The specification is explained here.
Another option when the overloads are not in the same package is to use import renaming, but that won't work in this case since both run functions are in the same package.

Alter how arguments are processed before they're passed to sub MAIN

Given the documentation and the comments on an earlier question, by request I've made a minimal reproducible example that demonstrates a difference between these two statements:
my %*SUB-MAIN-OPTS = :named-anywhere;
PROCESS::<%SUB-MAIN-OPTS><named-anywhere> = True;
Given a script file with only this:
#!/usr/bin/env raku
use MyApp::Tools::CLI;
and a module file in MyApp/Tools called CLI.pm6:
#PROCESS::<%SUB-MAIN-OPTS><named-anywhere> = True;
my %*SUB-MAIN-OPTS = :named-anywhere;
proto MAIN(|) is export {*}
multi MAIN( 'add', :h( :$hostnames ) ) {
for #$hostnames -> $host {
say $host;
}
}
multi MAIN( 'remove', *#hostnames ) {
for #hostnames -> $host {
say $host;
}
}
The following invocation from the command line will not result in a recognized subroutine, but show the usage:
mre.raku add -h=localhost -h=test1
Switching my %*SUB-MAIN-OPTS = :named-anywhere; for PROCESS::<%SUB-MAIN-OPTS><named-anywhere> = True; will print two lines with the two hostnames provided, as expected.
If however, this is done in a single file as below, both work identical:
#!/usr/bin/env raku
#PROCESS::<%SUB-MAIN-OPTS><named-anywhere> = True;
my %*SUB-MAIN-OPTS = :named-anywhere;
proto MAIN(|) is export {*}
multi MAIN( 'add', :h( :$hostnames )) {
for #$hostnames -> $host {
say $host;
}
}
multi MAIN( 'remove', *#hostnames ) {
for #hostnames -> $host {
say $host;
}
}
I find this hard to understand.
When reproducing this, be alert of how each command must be called.
mre.raku remove localhost test1
mre.raku add -h=localhost -h=test1
So a named array-reference is not recognized when this is used in a separate file with my %*SUB-MAIN-OPTS = :named-anywhere;. While PROCESS::<%SUB-MAIN-OPTS><named-anywhere> = True; always works. And for a slurpy array, both work identical in both cases.
The problem is that it isn't the same variable in both the script and in the module.
Sure they have the same name, but that doesn't mean much.
my \A = anon class Foo {}
my \B = anon class Foo {}
A ~~ B; # False
B ~~ A; # False
A === B; # False
Those two classes have the same name, but are separate entities.
If you look at the code for other built-in dynamic variables, you see something like:
Rakudo::Internals.REGISTER-DYNAMIC: '$*EXECUTABLE-NAME', {
PROCESS::<$EXECUTABLE-NAME> := $*EXECUTABLE.basename;
}
This makes sure that the variable is installed into the right place so that it works for every compilation unit.
If you look for %*SUB-MAIN-OPTS, the only thing you find is this line:
my %sub-main-opts := %*SUB-MAIN-OPTS // {};
That looks for the variable in the main compilation unit. If it isn't found it creates and uses an empty Hash.
So when you try do it in a scope other than the main compilation unit, it isn't in a place where it could be found by that line.
To test if adding that fixes the issue, you can add this to the top of the main compilation unit. (The script that loads the module.)
BEGIN Rakudo::Internals.REGISTER-DYNAMIC: '%*SUB-MAIN-OPTS', {
PROCESS::<%SUB-MAIN-OPTS> := {}
}
Then in the module, write this:
%*SUB-MAIN-OPTS = :named-anywhere;
Or better yet this:
%*SUB-MAIN-OPTS<named-anywhere> = True;
After trying this, it seems to work just fine.
The thing is, that something like that used to be there.
It was removed on the thought that it slows down every Raku program.
Though I think that any slowdown it caused would still be an issue as the line that is still there has to look to see if there is a dynamic variable of that name.
(There are more reasons given, and I frankly disagree with all of them.)
May a cuppa bring enlightenment to future SO readers pondering the meaning of things.[1]
Related answers by Liz
I think Liz's answer to an SO asking a similar question may be a good read for a basic explanation of why a my (which is like a lesser our) in the mainline of a module doesn't work, or at least confirmation that core devs know about it.
Her later answer to another SO explains how one can use my by putting it inside a RUN-MAIN.
Why does a slurpy array work by default but not named anywhere?
One rich resource on why things are the way they are is the section Declaring a MAIN subroutine of S06 (Synopsis on Subroutines)[2].
A key excerpt:
As usual, switches are assumed to be first, and everything after the first non-switch, or any switches after a --, are treated as positionals or go into the slurpy array (even if they look like switches).
So it looks like this is where the default behavior, in which nameds can't go anywhere, comes from; it seems that #Larry[3] was claiming that the "usual" shell convention was as described, and implicitly arguing that this should dictate that the default behavior was as it is.
Since Raku was officially released RFC: Allow subcommands in MAIN put us on the path to todays' :named-anywhere option. The RFC presented a very powerful 1-2 punch -- an unimpeachable two line hackers' prose/data argument that quickly led to rough consensus, with a working code PR with this commit message:
Allow --named-switches anywhere in command line.
Raku was GNU-like in that it has '--double-dashes' and that it stops interpreting named parameters when it encounters '--', but unlike GNU-like parsing, it also stopped interpreting named parameters when encountering any positional argument. This patch makes it a bit more GNU-like by allowing named arguments after a positional, to prepare for allowing subcommands.
> Alter how arguments are processed before they're passed to sub MAIN
In the above linked section of S06 #Larry also wrote:
Ordinarily a top-level Raku "script" just evaluates its anonymous mainline code and exits. During the mainline code, the program's arguments are available in raw form from the #*ARGS array.
The point here being that you can preprocess #*ARGS before they're passed to MAIN.
Continuing:
At the end of the mainline code, however, a MAIN subroutine will be called with whatever command-line arguments remain in #*ARGS.
Note that, as explained by Liz, Raku now has a RUN-MAIN routine that's called prior to calling MAIN.
Then comes the standard argument processing (alterable by using standard options, of which there's currently only the :named-anywhere one, or userland modules such as SuperMAIN which add in various other features).
And finally #Larry notes that:
Other [command line parsing] policies may easily be introduced by calling MAIN explicitly. For instance, you can parse your arguments with a grammar and pass the resulting Match object as a Capture to MAIN.
A doc fix?
Yesterday you wrote a comment suggesting a doc fix.
I now see that we (collectively) know about the coding issue. So why is the doc as it is? I think the combination of your SO and the prior ones provide enough anecdata to support at least considering filing a doc issue to the contrary. Then again Liz hints in one of the SO's that a fix might be coming, at least for ours. And SO is itself arguably doc. So maybe it's better to wait? I'll punt and let you decide. At least you now have several SOs to quote if you decide to file a doc issue.
Footnotes
[1] I want to be clear that if anyone perceives any fault associated with posting this SO then they're right, and the fault is entirely mine. I mentioned to #acw that I'd already done a search so they could quite reasonably have concluded there was no point in them doing one as well. So, mea culpa, bad coffee inspired puns included. (Bad puns, not bad coffee.)
[2] Imo these old historical speculative design docs are worth reading and rereading as you get to know Raku, despite them being obsolete in parts.
[3] #Larry emerged in Raku culture as a fun and convenient shorthand for Larry Wall et al, the Raku language team led by Larry.

Write a compiler for a language that looks ahead and multiple files?

In my language I can use a class variable in my method when the definition appears below the method. It can also call methods below my method and etc. There are no 'headers'. Take this C# example.
class A
{
public void callMethods() { print(); B b; b.notYetSeen();
public void print() { Console.Write("v = {0}", v); }
int v=9;
}
class B
{
public void notYetSeen() { Console.Write("notYetSeen()\n"); }
}
How should I compile that? what i was thinking is:
pass1: convert everything to an AST
pass2: go through all classes and build a list of define classes/variable/etc
pass3: go through code and check if there's any errors such as undefined variable, wrong use etc and create my output
But it seems like for this to work I have to do pass 1 and 2 for ALL files before doing pass3. Also it feels like a lot of work to do until I find a syntax error (other than the obvious that can be done at parse time such as forgetting to close a brace or writing 0xLETTERS instead of a hex value). My gut says there is some other way.
Note: I am using bison/flex to generate my compiler.
My understanding of languages that handle forward references is that they typically just use the first pass to build a list of valid names. Something along the lines of just putting an entry in a table (without filling out the definition) so you have something to point to later when you do your real pass to generate the definitions.
If you try to actually build full definitions as you go, you would end up having to rescan repatedly, each time saving any references to undefined things until the next pass. Even that would fail if there are circular references.
I would go through on pass one and collect all of your class/method/field names and types, ignoring the method bodies. Then in pass two check the method bodies only.
I don't know that there can be any other way than traversing all the files in the source.
I think that you can get it down to two passes - on the first pass, build the AST and whenever you find a variable name, add it to a list that contains that blocks' symbols (it would probably be useful to add that list to the corresponding scope in the tree). Step two is to linearly traverse the tree and make sure that each symbol used references a symbol in that scope or a scope above it.
My description is oversimplified but the basic answer is -- lookahead requires at least two passes.
The usual approach is to save B as "unknown". It's probably some kind of type (because of the place where you encountered it). So you can just reserve the memory (a pointer) for it even though you have no idea what it really is.
For the method call, you can't do much. In a dynamic language, you'd just save the name of the method somewhere and check whether it exists at runtime. In a static language, you can save it in under "unknown methods" somewhere in your compiler along with the unknown type B. Since method calls eventually translate to a memory address, you can again reserve the memory.
Then, when you encounter B and the method, you can clear up your unknowns. Since you know a bit about them, you can say whether they behave like they should or if the first usage is now a syntax error.
So you don't have to read all files twice but it surely makes things more simple.
Alternatively, you can generate these header files as you encounter the sources and save them somewhere where you can find them again. This way, you can speed up the compilation (since you won't have to consider unchanged files in the next compilation run).
Lastly, if you write a new language, you shouldn't use bison and flex anymore. There are much better tools by now. ANTLR, for example, can produce a parser that can recover after an error, so you can still parse the whole file. Or check this Wikipedia article for more options.

Writing a TemplateLanguage/VewEngine

Aside from getting any real work done, I have an itch. My itch is to write a view engine that closely mimics a template system from another language (Template Toolkit/Perl). This is one of those if I had time/do it to learn something new kind of projects.
I've spent time looking at CoCo/R and ANTLR, and honestly, it makes my brain hurt, but some of CoCo/R is sinking in. Unfortunately, most of the examples are about creating a compiler that reads source code, but none seem to cover how to create a processor for templates.
Yes, those are the same thing, but I can't wrap my head around how to define the language for templates where most of the source is the html, rather than actual code being parsed and run.
Are there any good beginner resources out there for this kind of thing? I've taken a ganer at Spark, which didn't appear to have the grammar in the repo.
Maybe that is overkill, and one could just test-replace template syntax with c# in the file and compile it. http://msdn.microsoft.com/en-us/magazine/cc136756.aspx#S2
If you were in my shoes and weren't a language creating expert, where would you start?
The Spark grammar is implemented with a kind-of-fluent domain specific language.
It's declared in a few layers. The rules which recognize the html syntax are declared in MarkupGrammar.cs - those are based on grammar rules copied directly from the xml spec.
The markup rules refer to a limited subset of csharp syntax rules declared in CodeGrammar.cs - those are a subset because Spark only needs to recognize enough csharp to adjust single-quotes around strings to double-quotes, match curley braces, etc.
The individual rules themselves are of type ParseAction<TValue> delegate which accept a Position and return a ParseResult. The ParseResult is a simple class which contains the TValue data item parsed by the action and a new Position instance which has been advanced past the content which produced the TValue.
That isn't very useful on it's own until you introduce a small number of operators, as described in Parsing expression grammar, which can combine single parse actions to build very detailed and robust expressions about the shape of different syntax constructs.
The technique of using a delegate as a parse action came from a Luke H's blog post Monadic Parser Combinators using C# 3.0. I also wrote a post about Creating a Domain Specific Language for Parsing.
It's also entirely possible, if you like, to reference the Spark.dll assembly and inherit a class from the base CharGrammar to create an entirely new grammar for a particular syntax. It's probably the quickest way to start experimenting with this technique, and an example of that can be found in CharGrammarTester.cs.
Step 1. Use regular expressions (regexp substitution) to split your input template string to a token list, for example, split
hel<b>lo[if foo]bar is [bar].[else]baz[end]world</b>!
to
write('hel<b>lo')
if('foo')
write('bar is')
substitute('bar')
write('.')
else()
write('baz')
end()
write('world</b>!')
Step 2. Convert your token list to a syntax tree:
* Sequence
** Write
*** ('hel<b>lo')
** If
*** ('foo')
*** Sequence
**** Write
***** ('bar is')
**** Substitute
***** ('bar')
**** Write
***** ('.')
*** Write
**** ('baz')
** Write
*** ('world</b>!')
class Instruction {
}
class Write : Instruction {
string text;
}
class Substitute : Instruction {
string varname;
}
class Sequence : Instruction {
Instruction[] items;
}
class If : Instruction {
string condition;
Instruction then;
Instruction else;
}
Step 3. Write a recursive function (called the interpreter), which can walk your tree and execute the instructions there.
Another, alternative approach (instead of steps 1--3) if your language supports eval() (such as Perl, Python, Ruby): use a regexp substitution to convert the template to an eval()-able string in the host language, and run eval() to instantiate the template.
There are sooo many thing to do. But it does work for on simple GET statement plus a test. That's a start.
http://github.com/claco/tt.net/
In the end, I already had too much time in ANTLR to give loudejs' method a go. I wanted to spend a little more time on the whole process rather than the parser/lexer. Maybe in version 2 I can have a go at the Spark way when my brain understands things a little more.
Vici Parser (formerly known as LazyParser.NET) is an open-source tokenizer/template parser/expression parser which can help you get started.
If it's not what you're looking for, then you may get some ideas by looking at the source code.