K Framework: Substitution not substituting in simple terms? - kframework

I have the following K file:
require "substitution.k"
module PURE
imports DOMAINS
imports SUBSTITUTION
syntax PSort ::= "$Type" [token]
| "$Kind" [token]
syntax Type ::= PSort
| KVar
| "Pi" KVar ":" Term "." Term [binder]
syntax Term ::= Type
| "(" Term ")" [bracket]
> Term Term [left]
> "declare" KVar ":" Term "in" Term
syntax KResult ::= Type
configuration
<T>
<k> typeof($PGM:Term, ?T) ~> ?T </k>
<typeEnv> .Map </typeEnv>
</T>
syntax KItem ::= typeof(Term, Term)
rule <k> typeof(declare X : T in E, T2) => typeof(E, T2) ... </k>
<typeEnv> TEnv => TEnv[X <- T] </typeEnv>
// VAR
rule <k> typeof(X:KVar, T) => . ... </k>
<typeEnv> ... X |-> T ... </typeEnv>
// APP
syntax KItem ::= Term "=" Term
rule T = T => .
rule typeof(M N, T) =>
typeof(M, Pi ?X : ?T1. ?T2) ~>
typeof(N, ?T1) ~>
?T2[N/?X] = T
endmodule
When I compile it with the Java backend and run the following file:
declare nat : $Type in
declare Z : nat in
declare Vector : Pi n : nat . $Type in
declare blah : Pi n : nat . (Vector n) in
blah Z
I get:
<T>
<k>
Vector n
</k>
<typeEnv>
Vector |-> Pi n : nat . $Type
Z |-> nat
blah |-> Pi n : nat . ( Vector n )
nat |-> $Type
</typeEnv>
</T>
But I want it to substitute Z for n and get Vector Z.

This appears to be a bug in the Java backend: it applies the substitution operator too early, while its arguments are still symbolic variables. The operator is therefore consumed before the term it acts on has been instantiated, so when that term is later instantiated via unification, nothing has been substituted in it, which produces the behavior you describe. Here is an issue tracking the problem: https://github.com/kframework/k/issues/1165
I took a stab at fixing it, but it proved to be nontrivial and I don't have time to dig deeper right now. You are welcome to try to fix it in a pull request if you want, although I am unsure why the fix I wrote makes other things break. A better option is probably to rewrite your typing rules so that they don't try to perform substitution on a variable. One way to do this would be to make the rule for application modify the type environment and then restore it once the term has been fully typed. You can take a look at the K tutorial folder 1_k/5_types for some examples of how you can type a lambda-calculus-like language.
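To make the failure mode concrete, here is a minimal Python sketch of the ordering problem. It is a toy model, not K's implementation: the `Meta` class, `resolve`, and `subst` are hypothetical names invented for illustration, and terms are modeled as plain strings and tuples.

```python
from dataclasses import dataclass


@dataclass
class Meta:
    """A unification variable such as ?T2; `binding` is filled in later (toy model)."""
    name: str
    binding: object = None


def resolve(t):
    """Replace bound metavariables by their bindings, recursively."""
    if isinstance(t, Meta):
        return resolve(t.binding) if t.binding is not None else t
    if isinstance(t, tuple):
        return tuple(resolve(x) for x in t)
    return t


def subst(t, var, value):
    """Structural substitution t[value/var]. A metavariable is left untouched,
    because there is no term structure to traverse yet."""
    if isinstance(t, Meta):
        return t
    if isinstance(t, tuple):
        return tuple(subst(x, var, value) for x in t)
    return value if t == var else t


body = ("Vector", "n")  # the term that ?T2 will eventually be unified with

# Eager order (the behavior described above): substitute first, unify later.
t2 = Meta("?T2")
eager = subst(t2, "n", "Z")      # the substitution is consumed here and does nothing
t2.binding = body                # unification instantiates ?T2 afterwards
eager_result = resolve(eager)

# Deferred order: unify first, then substitute into the instantiated term.
t2b = Meta("?T2", binding=body)
deferred_result = subst(resolve(t2b), "n", "Z")

print(eager_result)     # the substitution was lost
print(deferred_result)  # the substitution was applied
```

In this model the eager order leaves `n` in place, mirroring the `Vector n` output in the question, while the deferred order yields the expected `Vector Z`.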

Related

K Framework: problem with semantic cast in function declarations

I'm trying to update an Erlang semantics from K 3.6 to 5.0 and I ran into the following issue:
When I try to write a function declaration without semantic cast, it works fine:
rule Name:Atom(Args) -> Body . =>. ... [structural]
But when I write the following, kompile outputs [Error] Inner Parser: Parse error: unexpected token ')'.
rule Name:Atom(Args:Values) -> Body => . ... [structural]
To reproduce, here is my simplified syntax:
imports STRING
syntax UnquotedAtom ::= r"[a-z][_a-zA-Z0-9#]*" [token]
syntax Atom ::= UnquotedAtom | Bool
syntax Exp ::= Atom
syntax Exps ::= List{Exp, ","} [strict, klabel("exps"), prefer, listexps]
syntax FunCl ::= Atom"("Exps")" "->" Exps "." [funcl1]
syntax Value ::= Atom
syntax Values ::= List{Value, ","}
syntax Exp ::= Value
syntax KResult ::= Value
// Function declaration
//ok
rule <k>Name:Atom(Args) -> Body . =>. ...</k> [structural]
// unexpected token ')'
rule <k>Name:Atom(Args:Values) -> Body => . ...</k> [structural]
My K version is:
RV-K version 1.0-SNAPSHOT
Git revision: adf2f2d
Git branch: UNKNOWN
Build date: Tue Mar 16 16:43:04 CET 2021
One of the changes from K3 to K5 is that lists are no longer automatically subsorted if the elements are subsorted. If you manually add
syntax Exps ::= Values
then your rule will kompile again.

When are K configuration cells type-checked?

It is a common K idiom to define a programming language's syntax with a top-sort of well-formed programs (e.g. Pgm) and then to restrict the <k> cell to have this sort in the configuration declaration using the special $PGM variable which is passed automatically by krun. This prevents users from executing programs with krun that are not well-formed. My question is:
1. Are the sorts of cells checked only at start-up time or after each rule evaluation?
2. Do different cells show different behavior depending on their identity (e.g. the <k> cell) or how they are typed (e.g. user-defined types versus builtin types)?
Here is a partial example to show what I mean:
configuration
<mylang>
<k> $PGM:Pgm </k>
<env> .Env:Env </env> // Env is a custom map structure defined for environments
<store> .Map </store> // For the store we use the K builtin Map
...
</mylang>
For the <k> cell, I conclude that it is definitely only checked at start-up time, since program evaluation typically tears a program apart into an expression and a continuation (e.g. ADD ~> ...) which cannot have the sort Pgm anymore because ~> is builtin.
So, elaborating on questions (1-2) above, is the <k> cell exceptional in this sense?
Each rule is sort-checked at kompile time to be sort-preserving, so there is no need to check this at runtime. If something of the correct sort goes in, something of the correct sort comes out.
The <k> cell gets sort K, at least in this definition, for example: https://github.com/kframework/evm-semantics/blob/272608d70f363ed3d8d921887b98a26102a03032/evm.md#configuration
it results in compiled.txt (found at .build/defn/java/driver-kompiled/compiled.txt) which looks like:
...
syntax KCell ::= "project:KCell" "(" K ")" [function, projection]
syntax KCell ::= "initKCell" "(" Map ")" [function, initializer, noThread]
syntax KCell ::= "<k>" K "</k>" [cell, cellName(k), contentStartColumn(7), contentStartLine(31), format(%1%i%n%2%d%n%3), maincell, org.kframework.definition.Production(syntax #RuleContent ::= #RuleBody [klabel(#ruleNoConditions), symbol])]
...
But other cells get more specific sorts:
...
syntax JumpDestsCell ::= "project:JumpDestsCell" "(" K ")" [function, projection]
syntax JumpDestsCell ::= "initJumpDestsCell" [function, initializer, noThread]
syntax JumpDestsCell ::= "<jumpDests>" Set "</jumpDests>" [cell, cellName(jumpDests), contentStartColumn(7), contentStartLine(31), format(%1%i%n%2%d%n%3), org.kframework.definition.Production(syntax #RuleContent ::= #RuleBody [klabel(#ruleNoConditions), symbol])]
...
I'm not sure how K decides that the <k> cell needs to get sort K, but I don't think it's based on analyzing the rules. I think it's likely that it sees $PGM in that cell, so it adds the maincell attribute you see and gives it sort K. Everything is a subsort of K.
I'm fairly certain it's not any $ variable in the configuration that gives it sort K, because the <chainID> cell in KEVM gets these declarations:
...
syntax ChainIDCell ::= "project:ChainIDCell" "(" K ")" [function, projection]
syntax ChainIDCell ::= "initChainIDCell" "(" Map ")" [function, initializer, noThread]
syntax ChainIDCell ::= "<chainID>" Int "</chainID>" [cell, cellName(chainID), contentStartColumn(7), contentStartLine(31), format(%1%i%n%2%d%n%3), org.kframework.definition.Production(syntax #RuleContent ::= #RuleBody [klabel(#ruleNoConditions), symbol])]
...
Note that there isn't very much special about the _~>_ operator. It's declared here: https://github.com/kframework/k/blob/135469ea0ebea96dacf0f9a49261ff1171440c20/k-distribution/include/kframework/builtin/kast.k#L57

Formal grammar and arity

I have the following grammar:
S --> LR .
L --> aL .
R --> bR .
This grammar generates the language a^n b^k, where n,k > 0.
I want a grammar that generates the language a^n b^n where n > 0, so my goal is to obtain a grammar which ensures that the number of a's always equals the number of b's, while still keeping the non-terminals L and R.
Is there a way to do this?
In a context-free grammar, the derivations of L and R in S → L R are independent of each other. That is what "context free" means: the derivation of a non-terminal is not affected by the context in which the non-terminal occurs.
So if you want a grammar in which L and R must derive strings of equal length, it will have to be a context-sensitive grammar. No context-free grammar can do that.
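Of course, there is a simple CFG for the language (as written it also derives the empty string; use S → a b as the base production instead if n > 0 is required):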
S →
S → a S b
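As a quick sanity check of the two productions, here is a small Python sketch. The function names `derive` and `matches` are made up for this illustration. Note that the empty production means n = 0 is accepted as well.

```python
def derive(n):
    """Apply S -> a S b exactly n times, then finish with S -> (empty)."""
    s = "S"
    for _ in range(n):
        s = s.replace("S", "aSb")
    return s.replace("S", "")


def matches(s):
    """Recursive recognizer that mirrors the two productions."""
    if s == "":
        return True  # S -> (empty)
    # S -> a S b: strip one 'a' from the front and one 'b' from the back
    return len(s) >= 2 and s[0] == "a" and s[-1] == "b" and matches(s[1:-1])


print(derive(3))
print(matches("aabb"), matches("aab"))
```

Every `a` is produced together with its matching `b` by a single production, which is why the counts cannot drift apart the way they can with independent L and R.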

caret prefix instead of postfix in antlr

I know what the caret postfix means in ANTLR (i.e. make root), but what about when the caret is a prefix, as in the following grammar I have been reading (this grammar is brand new and was written by a new team learning ANTLR)?
selectClause
: SELECT resultList -> ^(SELECT_CLAUSE resultList)
;
fromClause
: FROM tableList -> ^(FROM_CLAUSE tableList)
;
Also, I know what => means but what about the -> ? What does -> imply?
thanks,
Dean
The ^ is used as an inline tree operator, indicating a certain token should become the root of the tree.
For example, the rule:
p : A B^ C;
creates the following AST:
  B
 / \
A   C
There's another way to create an AST which is using a rewrite rule. A rewrite rule is placed after (or at the right of) an alternative of a parser rule. You start a rewrite rule with an "arrow", ->, followed by the rules/tokens you want to be in the AST.
Take the previous rule:
p : A B C;
and suppose you want to reverse the tokens, but keep the AST "flat" (no root node). This can be done using the following rewrite rule:
p : A B C -> C B A;
And if you want to create an AST similar to p : A B^ C;, you start your rewrite rule with ^( ... ) where the first token/rule inside the parentheses will become the root node. So the rule:
p : A B C -> ^(B A C);
produces the same AST as p : A B^ C;.
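The difference between a flat rewrite and a rooted one can be modeled in a few lines of Python. This is only a toy model of the resulting tree shapes, not ANTLR's API: the helper names are hypothetical, and a tree is represented as a `(root, children)` pair with `None` standing for "no root".

```python
def inline_root(tokens, root_index):
    """Model `A B^ C`: the token marked with ^ becomes the root,
    the remaining tokens stay as children in their original order."""
    children = [t for i, t in enumerate(tokens) if i != root_index]
    return (tokens[root_index], children)


def rewrite_tree(root, *children):
    """Model `-> ^(root child ...)`: the first element is the root."""
    return (root, list(children))


def rewrite_flat(*nodes):
    """Model `-> C B A`: a flat list of nodes without a root."""
    return (None, list(nodes))


inline = inline_root(["A", "B", "C"], 1)   # p : A B^ C;
rooted = rewrite_tree("B", "A", "C")       # p : A B C -> ^(B A C);
flat = rewrite_flat("C", "B", "A")         # p : A B C -> C B A;

print(inline == rooted)
print(flat)
```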
Related:
Tree construction
How to output the AST built using ANTLR?

What does ^ and ! stand for in ANTLR grammar

I was having difficulty figuring out what ^ and ! stand for in ANTLR grammar terminology.
Have a look at the ANTLR Cheat Sheet:
! don't include in AST
^ make AST root node
And ^ can also be used in rewrite rules: ... -> ^( ... ). For example, the following two parser rules are equivalent:
expression
: A '+'^ A ';'!
;
and:
expression
: A '+' A ';' -> ^('+' A A)
;
Both create the following AST:
  +
 / \
A   A
In other words: the + becomes the root, the two A's become its children, and the ; is omitted from the tree.
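The effect of the two suffix operators can be mimicked with a short Python sketch. It is a simplified model that assumes at most one `^` per alternative; `build_ast` is a hypothetical helper, not part of ANTLR.

```python
def build_ast(marked_tokens):
    """Build a (root, children) pair from (token, op) pairs, where op is
    '^' (make root), '!' (leave out of the AST) or '' (ordinary child)."""
    root = None
    children = []
    for token, op in marked_tokens:
        if op == "!":
            continue              # excluded from the tree
        if op == "^":
            root = token          # becomes the root node
        else:
            children.append(token)
    return (root, children)


# expression : A '+'^ A ';'! ;
ast = build_ast([("A", ""), ("+", "^"), ("A", ""), (";", "!")])
print(ast)
```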