"Cannot find pattern" when using a tagged union type - elm

I have an app that spans multiple modules. In the first, I model my problem, creating several data types. In the second, I'm putting views.
One of those types is a tagged union type:
type alias Letter = Char
type GuessedLetter = Guessed Letter | Unguessed
In my View module, I have a function for displaying a letter:
guessToChar : GuessedLetter -> Char
guessToChar guess =
    case guess of
        Guessed l -> l
        Unguessed -> '_'
But when I try compiling these files, I get the following error:
## ERRORS in src/Views.elm #####################################################
-- NAMING ERROR -------------------------------------------------- src/Views.elm
Cannot find pattern `Guessed`
21| Guessed l -> l
^^^^^^^^^
-- NAMING ERROR -------------------------------------------------- src/Views.elm
Cannot find pattern `Unguessed`
22| Unguessed -> '_'
^^^^^^^^^
Detected errors in 1 module.
I thought "Maybe I should export the tags as well as the type?", but neither adding the tags to the module exports nor attempting to fully-qualify the tags (GuessedLetter.Guessed) has resolved the issue.
How do I fix this function?

As I suspected, if you want to use the tags outside the module, you have to export them too. (I just wasn't sure how.)
To do that, add the tags in a comma-separated list inside parentheses.
From the source code for Maybe (a type that 'worked' the way I wanted mine to):
module Maybe exposing
( Maybe(Just,Nothing)
, andThen
, map, map2, map3, map4, map5
, withDefault
, oneOf
)
Or in my case:
module Game exposing (Letter, GuessedLetter(Guessed, Unguessed))
On the importing side, you can then choose to fully-qualify the tags (with the module, not the type):
import Game exposing (GuessedLetter)
{- ... -}
guessToChar : GuessedLetter -> Char
guessToChar guess =
    case guess of
        Game.Guessed l -> l
        Game.Unguessed -> '_'
or expose the tags too:
import Game exposing (GuessedLetter(Guessed, Unguessed))
{- ... -}
guessToChar : GuessedLetter -> Char
guessToChar guess =
    case guess of
        Guessed l -> l
        Unguessed -> '_'

Related

How to define Pascal variables in PetitParser

Here is the (simplified) EBNF section I'm trying to implement in PetitParser:
variable :: component / identifier
component :: indexed / field
indexed :: variable , $[ , blah , $]
field :: variable , $. , identifier
What I did was to add all these productions (except identifier) as ivars of my subclass of PPCompositeParser and define the corresponding methods as follows:
variable
^component / self identifier
component
^indexed / field
identifier
^(#letter asParser, (#word asParser) star) flatten
indexed
^variable , $[ asParser, #digit asParser, $] asParser
field
^variable , $. asParser, self identifier
start
^variable
Finally, I created a new instance of my parser and sent to it the message parse: 'a.b[0]'.
The problem: I get a stack overflow.
The grammar has a left recursion: variable -> component -> indexed -> variable. PetitParser uses Parsing Expression Grammars (PEGs) that cannot handle left recursion. A PEG parser always takes the left option until it finds a match. In this case it will not find a match due to the left recursion. To make it work you need to first eliminate left recursion. Eliminating all left recursion could be more tricky as you will also get one through field after eliminating the first. For example, you can write the grammar as follows to make the left recursion more obvious:
variable = (variable , $[ , blah , $]) | (variable , $. , identifier) | identifier
If you have a left recursion like:
A -> A a | b
you can eliminate it like (e is an empty parser)
A -> b A'
A' -> a A' | e
You'll need to apply this twice to get rid of the recursion.
Alternatively you can choose to simplify the grammar if you do not want to parse all possible combinations of identifiers.
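To make the transformation concrete, here is a minimal Python sketch (not PetitParser code) of the grammar after left-recursion elimination: variable -> identifier variable', where variable' -> ('[' digit ']' | '.' identifier) variable' | e. The function name parse_variable and the tuple-shaped result are assumptions made for illustration; the loop plays the role of the right-recursive variable' rule, and every iteration consumes at least one character.

```python
import re

# identifier: a letter followed by letters or digits (mirrors #letter, #word star)
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9]*")
# subscript: '[' single digit ']' (mirrors $[ , #digit , $])
SUBSCRIPT = re.compile(r"\[([0-9])\]")


def parse_variable(text):
    """variable -> identifier variable' (hypothetical helper, for illustration)."""
    match = IDENTIFIER.match(text)
    if match is None:
        raise ValueError("expected an identifier at position 0")
    node = ("identifier", match.group())
    pos = match.end()
    # variable' -> (subscript | fieldName) variable' | e, written as a loop.
    while pos < len(text):
        if text[pos] == ".":
            match = IDENTIFIER.match(text, pos + 1)
            if match is None:
                raise ValueError(f"expected an identifier at position {pos + 1}")
            node = ("field", node, match.group())
        else:
            match = SUBSCRIPT.match(text, pos)
            if match is None:
                raise ValueError(f"unexpected input at position {pos}")
            node = ("indexed", node, int(match.group(1)))
        pos = match.end()
    return node


print(parse_variable("a.b[0]"))
```

Because each new node wraps the previous one, the result is still the left-nested tree that the original left-recursive grammar describes.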
The problem is that your grammar is left recursive. PetitParser uses a top-down greedy algorithm to parse the input string. If you follow the steps, you'll see that it goes from start to variable -> component -> indexed -> variable. This becomes a loop that is executed indefinitely without consuming any input, and that is the reason for the stack overflow (that is left recursion in practice).
The trick to solve the situation is to rewrite the parser by adding intermediate steps that avoid the left recursion. The basic idea is that the rewritten version will consume at least one character in each cycle. Let's start by simplifying the parser a bit, factoring out the non-recursive parts of indexed and field and moving them to the bottom.
variable
^component / self identifier
component
^indexed / field
indexed
^variable, subscript
field
^variable, fieldName
start
^variable
subscript
^$[ asParser, #digit asParser, $] asParser
fieldName
^$. asParser, self identifier
identifier
^(#letter asParser, (#word asParser) star) flatten
Now you can more easily see (by following the loop) that if the recursion in variable is to end, an identifier has to be found at the beginning. That's the only way to start, and then comes more input (or ends). Let's call that second part variable':
variable
^self identifier, variable'
Now variable' refers to the rest of the input once the identifier has been consumed, and we can safely move the recursion from the left of indexed and field to the right in variable':
variable'
^component', variable' / nil asParser
component'
^indexed' / field'
indexed'
^subscript
field'
^fieldName
I've written this answer without actually testing the code, but it should be roughly right. The parser can be simplified further; I leave that as an exercise ;).
For more information on left-recursion elimination you can have a look at left recursion elimination

Issues with walking an ANTLR parse tree using a listener

I have been working with ANTLR, trying to parse ANTLR grammar (.g4) files and store them in a data structure so that I can mutate the rules and then run ANTLR on the grammars with the mutated rules. Using the ANTLRv4Parser grammar, I am writing a listener that walks down the tree and stores the tokens. This works, except that for rules with alternatives the pipe symbol "|" ends up in the wrong position. It comes from the following rule in the ANTLRv4Parser grammar: ruleAltList : alternative (OR alternative)* . In enterRuleAltList in my listener I am struggling to get the tokens from the child nodes of the alternative before the pipe and then those after it. It seems ANTLR does a preorder traversal, so the listener gets the pipe before heading down into alternative.
So what I want is to keep using ANTLR's listener pattern, but with some sort of inorder traversal.
Here's a snippet from the ANTLRv4Parser grammar:
ruleAltList: labeledAlt (OR labeledAlt)* ;
The ANTLRv4Parser grammar and other grammars can be found at https://github.com/antlr/grammars-v4/tree/master/antlr4
For example, if I have the following grammar
grammar c;
A : B | C;
I want to be able to store it in a data structure as
["A", ":", "B","|","C",";"]
What I get is
["A", ":", "|","B", "C",";"]
So any ideas on how to override the enterRuleAltList method in my listener to have tokens from alternative child node before the OR, which is "|"?
Reduced representation of the grammar:
parserRuleSpec
: RULE_REF COLON ruleBlock SEMI
;
ruleBlock
: ruleAltList
;
ruleAltList
: labeledAlt (OR labeledAlt)*
;
labeledAlt
: terminal
;
Collecting all terminals of the nodes as encountered during the walk should result in an ordering ["A", ":", ";", "|", "B", "C"]. (Post the actual full listener code if what was originally given was not a typo.)
enterParserRuleSpec -> A : ;
enterRuleBlock
enterRuleAltList -> |
enterLabeledAlt
enterTerminal -> B
enterLabeledAlt
enterTerminal -> C
When collecting the terminals, attention must be paid to their order in the list of context children relative to their sibling non-terminals.
Or, possibly, just collect the terminals into a list sorted by token index.
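The ordering problem and both fixes can be reproduced without ANTLR. The sketch below is plain Python with hypothetical Terminal and Rule classes standing in for parse-tree nodes (it is not the ANTLR runtime API); the tree is the one for `A : B | C ;` under the reduced grammar above, with labeledAlt holding its token directly for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Terminal:
    text: str
    index: int  # position of the token in the input stream


@dataclass
class Rule:
    name: str
    children: list = field(default_factory=list)


def collect_on_enter(node, out):
    """Mimics a listener that grabs a rule's own terminals in its enter method."""
    if isinstance(node, Terminal):
        return
    out.extend(c for c in node.children if isinstance(c, Terminal))
    for child in node.children:
        collect_on_enter(child, out)


def collect_leaves(node, out):
    """Visits children strictly in order, recording each terminal when reached."""
    if isinstance(node, Terminal):
        out.append(node)
    else:
        for child in node.children:
            collect_leaves(child, out)


tree = Rule("parserRuleSpec", [
    Terminal("A", 0),
    Terminal(":", 1),
    Rule("ruleBlock", [
        Rule("ruleAltList", [
            Rule("labeledAlt", [Terminal("B", 2)]),
            Terminal("|", 3),
            Rule("labeledAlt", [Terminal("C", 4)]),
        ]),
    ]),
    Terminal(";", 5),
])

on_enter = []
collect_on_enter(tree, on_enter)
print([t.text for t in on_enter])                                  # listener order
print([t.text for t in sorted(on_enter, key=lambda t: t.index)])   # fix 1: sort by token index

leaves = []
collect_leaves(tree, leaves)
print([t.text for t in leaves])                                    # fix 2: record leaves as visited
```

In an actual ANTLR listener the second fix would correspond to recording tokens in visitTerminal rather than in the enter methods, and the first to sorting on the token index; both are suggestions to adapt, not tested against the asker's code.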

Recursive rule in ANTLR

I need an idea how to express a statement like the following:
Int<Double<Float>>
So, in an abstract form we should have:
1.(easiest case): a<b>
2. case: a<a<b>>
3. case: a<a<a<b>>>
4. ....and so on...
The thing is that I should enable the possibility to embed a statement of the form a < b > within the < .. > - signs such that I have a nested statement. In other words: I should replace the b with a< b >.
The 2nd thing is that the number of the opening and closed <>-signs should be equal.
How can I do that in ANTLR ?
A rule can refer to itself without any problem¹. Let's say we have a rule type which describes your case, in a minimalist approach:
type: typeLiteral ('<' type '>')?;
typeLiteral: 'Int' | 'Double' | 'Float';
Note that ('<' type '>') is optional, denoted by the ? symbol, so a typeLiteral on its own is a valid type. Here are the syntax trees generated by these rules for your example Int<Double<Float>>:
¹: As long as some terminals (like '<' or '>') can determine when the recursion stops.
Image generated by http://ironcreek.net/phpsyntaxtree/
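As a rough illustration of why the self-referencing rule also keeps the angle brackets balanced, here is a hand-written recursive-descent sketch in Python of the same two rules. The function names and the (name, inner) tuple representation are assumptions for illustration, not something ANTLR generates.

```python
import re

LITERALS = {"Int", "Double", "Float"}  # typeLiteral: 'Int' | 'Double' | 'Float'


def parse_type(tokens, pos=0):
    """type: typeLiteral ('<' type '>')?  Returns (tree, next position)."""
    if pos >= len(tokens) or tokens[pos] not in LITERALS:
        raise ValueError(f"expected a type literal at token {pos}")
    name = tokens[pos]
    pos += 1
    if pos < len(tokens) and tokens[pos] == "<":
        inner, pos = parse_type(tokens, pos + 1)  # the rule refers to itself
        if pos >= len(tokens) or tokens[pos] != ">":
            raise ValueError(f"expected '>' at token {pos}")
        return (name, inner), pos + 1
    return (name, None), pos


def parse(text):
    # Any other non-space character becomes its own token and is rejected later.
    tokens = re.findall(r"Int|Double|Float|<|>|\S", text)
    tree, pos = parse_type(tokens)
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return tree


print(parse("Int<Double<Float>>"))
```

Each '<' is consumed by exactly one invocation of the rule, and that same invocation insists on a matching '>', so unbalanced input such as Int<Double>> or Int<Double is rejected.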

Not able to understand ANTLR parser rule

I am new to ANTLR and am walking through an existing grammar (found on the Internet). I am not able to understand what the rule below is about,
especially $model_expr inside the tree construct and the initial (unary_expr -> unary_expr). Please help me understand it.
model_expr
: (unary_expr -> unary_expr)
(LEFT_BRACKET model_expr_element RIGHT_BRACKET
-> ^(MODEL_EXPR[$LEFT_BRACKET] $model_expr model_expr_element))?
;
Thanks
Here is a detailed explanation of the above syntax, with an example (copied from the book):
Referencing Previous Rule ASTs in Rewrite Rules
Sometimes you can’t build the proper AST in a purely declarative manner. In other words, executing a single rewrite after the parser has matched everything in a rule is insufficient. Sometimes you need to iteratively build up the AST. To iteratively build an AST, you need to be able to reference the previous value of the current rule’s AST. You can reference the previous value by using $r within a rewrite rule, where r is the enclosing rule. For example, the following rule matches either a single integer or a series of integers added together:
expr : (INT -> INT) ( '+' i=INT -> ^( '+' $expr $i) ) * ;
The (INT->INT) subrule looks odd but makes sense. It says to match INT and then make its AST node the result of expr. This sets a result AST in case the (...)* subrule that follows matches nothing. To add another integer to an existing AST, you need to make a new '+' root node that has the previous expression as the left child and the new integer as the right child.
That grammar with embedded rewrite rules recognizes the same input and generates the same tree as the following version that uses the construction operators:
expr : INT ('+'^ INT)*;
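The iterative build-up that $expr expresses can be mimicked in a few lines of Python. This is only an analogy under stated assumptions: tokens are plain strings, trees are nested tuples, and build_expr is a hypothetical name; it is not how ANTLR's generated code looks.

```python
def build_expr(tokens):
    """Mirror of: expr : (INT -> INT) ( '+' i=INT -> ^('+' $expr $i) )* ;"""
    stream = iter(tokens)
    tree = next(stream, None)          # (INT -> INT): the first INT is the result so far
    if tree is None:
        raise ValueError("expected INT")
    for operator in stream:
        if operator != "+":
            raise ValueError(f"expected '+', got {operator!r}")
        operand = next(stream, None)
        if operand is None:
            raise ValueError("expected INT after '+'")
        # ^('+' $expr $i): new '+' root, previous tree on the left, new INT on the right
        tree = ("+", tree, operand)
    return tree


print(build_expr(["1", "+", "2", "+", "3"]))
```

The variable tree plays the role of $expr: on every pass through the loop it holds the AST built so far and is replaced by a new root that contains it.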

Issues of Error handling with ANTLR3

I tried error reporting in following manner.
@members {
    public String getErrorMessage(RecognitionException e, String[] tokenNames)
    {
        List stack = getRuleInvocationStack(e, this.getClass().getName());
        String msg = null;
        if (e instanceof NoViableAltException) {
            <some code>
        }
        else {
            msg = super.getErrorMessage(e, tokenNames);
        }
        String[] inputLines = e.input.toString().split("\r\n");
        String line = "";
        if (e.token.getCharPositionInLine() == 0)
            line = "at \"" + inputLines[e.token.getLine() - 2];
        else if (e.token.getCharPositionInLine() > 0)
            line = "at \"" + inputLines[e.token.getLine() - 1];
        return ": " + msg.split("at")[0] + line + "\" => [" + stack.get(stack.size() - 1) + "]";
    }
    public String getTokenErrorDisplay(Token t) {
        return t.toString();
    }
}
And now errors are displayed as follows.
line 6:7 : missing CLOSSB at "int a[6;" => [var_declaration]
line 8:0 : missing SEMICOL at "int p" => [var_declaration]
line 8:5 : missing CLOSB at "get(2;" => [call]
I have 2 questions.
1) Is there a proper way to do the same thing I have done?
2) I want to replace CLOSSB, SEMICOL, CLOSB etc. with their real symbols. How can I do that using the map in .g file?
Thank you.
1) Is there a proper way to do the same thing I have done?
I don't know if there is a defined proper way of showing errors. My take on showing errors is a litmus test: if the user can figure out how to fix the error based on what you have given them, then it is good. If the user is confused by the error message, then the message needs more work. Based on the examples given in the question, symbols were only char constants.
My favorite way of seeing errors is with the line with an arrow pointing at the location.
i.e.
Expected closing brace on line 6.
int a[6;
^
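A message in that style needs only the source text, a line number, and a column. The following Python sketch shows the formatting step; the function name and parameters are hypothetical, and it assumes a 1-based line and a 0-based column (which is how I understand ANTLR to report them).

```python
def format_error(source, line, column, message):
    """Return the message, the offending source line, and a caret under the column.

    line is 1-based and column is 0-based (assumed conventions).
    """
    text = source.splitlines()[line - 1]
    pointer = " " * column + "^"
    return f"{message} on line {line}.\n{text}\n{pointer}"


program = "int x;\nint a[6;\n"
print(format_error(program, 2, 7, "Expected closing brace"))
```

If the source contains tabs, the caret will drift unless the padding copies the tabs from the source line, so that is worth handling in a real implementation.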
2) I want to replace CLOSSB, SEMICOL, CLOSB etc. with their real symbols. How can I do that using the map in .g file?
You will have to read the separately generated token file and then make a map, i.e. a dictionary data structure, to translate the token name into the token character(s).
EDIT
First we have to clarify what is meant by symbol. If you limit the definition of symbol to only those tokens that are defined in the tokens file with a char or string, e.g. '!'=13 or 'public'=92, then this can be done. If, however, you define symbol as any text associated with a token, then that is something other than what I was addressing or plan to address.
When ANTLR generates its token map it uses three different sources:
The char or string constants in the lexer
The char or string constants in the parser.
Internal tokens such as Invalid, Down, Up
Since the tokens in the lexer are not the complete set, one should use the tokens file as a starting point. If you look at the tokens file you will note that the lowest value is 4. If you look at the TokenTypes file (This is the C# version name) you will find the remaining defined tokens.
If you find names like T__ in the tokens file, those are the names ANTLR generated for the char/string literals in the parser.
If you are using string and/or char literals in parser rules, then ANTLR must create a new set of lexer rules that include all of the string and/or char literals in the parser rules. Remember that the parser can only see tokens and not raw text. So string and/or char literals cannot be passed to the parser.
To see the new set of lexer rules, use org.antlr.Tool -Xsavelexer, and then open the created grammar file. The name may be like.g . If you have string and/or char literals in your parser rules you will see lexer rules with names starting with T__ .
Now that you know all of the tokens and their values you can create a mapping table from the info given in the error to the string you want to output instead for the symbol.
The code at http://markmail.org/message/2vtaukxw5kbdnhdv#query:+page:1+mid:2vtaukxw5kbdnhdv+state:results
is an example.
However the mapping of the tokens can change for such things as changing rules in the lexer or changing char/string literals in the parser. So if the message all of a sudden output the wrong string for a symbol you will have to update the mapping table by hand.
While this is not a perfect solution, it is a possible solution depending on how you define symbol.
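As a sketch of such a mapping table, the Python below reads the text of a tokens file, pairs each literal with the token name that shares its number, and lets a hand-written table fill in the rest. It assumes the file consists of lines of the form NAME=number and 'literal'=number; the sample content, the function names, and the manual entries are all made up for illustration.

```python
def load_token_symbols(tokens_text, manual=None):
    """Build a {token name: display symbol} table from tokens-file text."""
    names, literals = {}, {}
    for raw in tokens_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key, _, value = line.rpartition("=")  # rpartition copes with '='=20
        token_type = int(value)
        if key.startswith("'"):
            literals[token_type] = key[1:-1]   # strip the surrounding quotes
        else:
            names[token_type] = key
    symbols = {names[t]: lit for t, lit in literals.items() if t in names}
    symbols.update(manual or {})               # hand-maintained entries win
    return symbols


def humanize(message, symbols):
    """Replace token names in an error message with their quoted symbols."""
    return " ".join(f"'{symbols[w]}'" if w in symbols else w for w in message.split(" "))


sample = "SEMICOL=4\nCLOSB=5\nT__13=13\n'!'=13\n"          # hypothetical file content
table = load_token_symbols(sample, manual={"SEMICOL": ";", "CLOSB": ")"})
print(humanize("missing SEMICOL at \"int p\"", table))
```

The manual table is exactly the part that has to be kept in sync by hand when lexer rules change, which is the maintenance cost described above.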
Note: Last time I looked ANTLR 4.x creates the table automatically for access within the parser because it was such a problem for so many with ANTLR 3.x.
Bhathiya wrote:
1) Is there a proper way to do the same thing I have done?
There is no single way to do this. Note that proper error-handling and reporting is tricky. Terence Parr spends a whole chapter on this in The Definitive ANTLR Reference (chapter 10). I recommend you get hold of a copy and read it.
Bhathiya wrote:
2) I want to replace CLOSSB, SEMICOL, CLOSB etc. with their real symbols. How can I do that using the map in .g file?
You can't. For SEMICOL this may seem easy to do, but how would you get this information for a token like FOO:
FOO : (X | Y)+;
fragment X : '4'..'6';
fragment Y : 'a' | 'bc' | . ;