Dealing with too many terminal nodes in grammar - antlr

I'm trying to write a parser for protobuf3 using the grammar from https://github.com/antlr/grammars-v4/blob/master/protobuf3/Protobuf3.g4,
and I'm trying to handle the type_ rule in my grammar:
field
: ( REPEATED )? type_ fieldName EQ fieldNumber ( LB fieldOptions RB )? SEMI
;
type_
: DOUBLE
| FLOAT
| INT32
| INT64
| UINT32
| UINT64
| SINT32
| SINT64
| FIXED32
| FIXED64
| SFIXED32
| SFIXED64
| BOOL
| STRING
| BYTES
| messageDefinition
| enumType
;
Inside enterField I have this snippet:
@Override
public void enterField(Protobuf3Parser.FieldContext ctx) {
    MessageDefinition messageDefinition = this.messageStack.peek();
    Field field = new Field();
    field.setName(ctx.fieldName().ident().getText());
    field.setPosition(ctx.fieldNumber().getAltNumber());
    messageDefinition.addField(field);
    super.enterField(ctx);
}
However, I'm not sure how to deal with the type_ context here. It has many terminal nodes (for the basic types), and it could also be a messageDefinition or an enumType.
For my use case, all I care about is whether it is a basic type (in which case I want the type name) or a complex type such as another message or enum (in which case I want the definition name).
Is there a way to do this without having to check each possible outcome of ctx.type_()?
Thank you

If both messageDefinition and enumType resolve to a single lexer token, you can make the access very easy by using a label:
type_
: value = DOUBLE
| value = FLOAT
| value = INT32
| value = INT64
| value = UINT32
| value = UINT64
| value = SINT32
| value = SINT64
| value = FIXED32
| value = FIXED64
| value = SFIXED32
| value = SFIXED64
| value = BOOL
| value = STRING
| value = BYTES
| value = messageDefinition
| value = enumType
;
With that, you only need to read the value field of the context:
@Override
public void enterField(Protobuf3Parser.FieldContext ctx) {
    ...
    String type = ctx.type_().value.getText();
    ...
    super.enterField(ctx);
}

Related

Returning a boxed iterator over a value in a structure [duplicate]

This question already has answers here:
Why is adding a lifetime to a trait with the plus operator (Iterator<Item = &Foo> + 'a) needed?
(1 answer)
What is the correct way to return an Iterator (or any other trait)?
(2 answers)
Closed 5 years ago.
I want a function on a struct which returns an iterator to one of the struct's members. I tried something like the following, which doesn't work.
struct Foo {
bar: Vec<String>,
}
impl Foo {
fn baz(&self) -> Box<Iterator<Item = &String>> {
Box::new(self.bar.iter())
}
}
fn main() {
let x = Foo { bar: vec!["quux".to_string()] };
x.baz();
}
When compiling, I get the error below:
error[E0495]: cannot infer an appropriate lifetime for lifetime parameter in function call due to conflicting requirements
--> src/main.rs:7:27
|
7 | Box::new(self.bar.iter())
| ^^^^
|
note: first, the lifetime cannot outlive the anonymous lifetime #1 defined on the method body at 6:5...
--> src/main.rs:6:5
|
6 | / fn baz(&self) -> Box<Iterator<Item = &String>> {
7 | | Box::new(self.bar.iter())
8 | | }
| |_____^
note: ...so that reference does not outlive borrowed content
--> src/main.rs:7:18
|
7 | Box::new(self.bar.iter())
| ^^^^^^^^
= note: but, the lifetime must be valid for the static lifetime...
note: ...so that expression is assignable (expected std::boxed::Box<std::iter::Iterator<Item=&std::string::String> + 'static>, found std::boxed::Box<std::iter::Iterator<Item=&std::string::String>>)
--> src/main.rs:7:9
|
7 | Box::new(self.bar.iter())
| ^^^^^^^^^^^^^^^^^^^^^^^^^
I also tried adding lifetime information, as suggested elsewhere:
struct Foo {
bar: Vec<String>,
}
impl Foo {
fn baz<'a>(&self) -> Box<Iterator<Item = &String> + 'a> {
Box::new(self.bar.iter())
}
}
fn main() {
let x = Foo { bar: vec!["quux".to_string()] };
x.baz();
}
Now my error is the following:
error[E0495]: cannot infer an appropriate lifetime for lifetime parameter in function call due to conflicting requirements
--> src/main.rs:7:27
|
7 | Box::new(self.bar.iter())
| ^^^^
|
note: first, the lifetime cannot outlive the anonymous lifetime #1 defined on the method body at 6:5...
--> src/main.rs:6:5
|
6 | / fn baz<'a>(&self) -> Box<Iterator<Item = &String> + 'a> {
7 | | Box::new(self.bar.iter())
8 | | }
| |_____^
note: ...so that reference does not outlive borrowed content
--> src/main.rs:7:18
|
7 | Box::new(self.bar.iter())
| ^^^^^^^^
note: but, the lifetime must be valid for the lifetime 'a as defined on the method body at 6:5...
--> src/main.rs:6:5
|
6 | / fn baz<'a>(&self) -> Box<Iterator<Item = &String> + 'a> {
7 | | Box::new(self.bar.iter())
8 | | }
| |_____^
note: ...so that expression is assignable (expected std::boxed::Box<std::iter::Iterator<Item=&std::string::String> + 'a>, found std::boxed::Box<std::iter::Iterator<Item=&std::string::String>>)
--> src/main.rs:7:9
|
7 | Box::new(self.bar.iter())
| ^^^^^^^^^^^^^^^^^^^^^^^^^
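A minimal sketch of one way to make this compile, assuming a Rust edition where trait objects are written with `dyn`. A boxed trait object defaults to a `'static` bound, and in the second attempt above `'a` is never connected to `&self`. Tying both the item reference and the box to the lifetime of `&self` removes the conflict:

```rust
struct Foo {
    bar: Vec<String>,
}

impl Foo {
    // `'a` names the borrow of `self`; the items and the boxed
    // iterator are both declared to live no longer than that borrow.
    fn baz<'a>(&'a self) -> Box<dyn Iterator<Item = &'a String> + 'a> {
        Box::new(self.bar.iter())
    }
}

fn main() {
    let x = Foo { bar: vec!["quux".to_string()] };
    // Consume the boxed iterator while `x` is still alive.
    for s in x.baz() {
        println!("{}", s);
    }
}
```

The box cannot outlive `x`, which is exactly the constraint the compiler could not infer from the original signatures.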

antlr how to get a smaller tree in this case

I am printing out the type and text fields of each node in the AST built from my grammar, and I get this:
type=5 text=and
type=14 text==
type=4 text=ALIAS
type=20 text=a
type=7 text=ATTR_NAME
type=20 text=column_b
type=36 text=STR_VAL
type=35 text="asdfds"
type=14 text==
type=4 text=ALIAS
type=20 text=a
type=7 text=ATTR_NAME
type=20 text=yyyy
type=12 text=DEC_VAL
type=11 text=564.555
Valid type ints in my generated lexer are
public static final int EOF=-1;
public static final int ALIAS=4;
public static final int AND=5;
public static final int ATTR_NAME=7;
public static final int DECIMAL=11;
public static final int DEC_VAL=12;
public static final int EQ=14;
public static final int ID=20;
public static final int STR_VAL=36;
I would very much like to never have type=20 in the tree. Instead, I want the nodes with type 20 moved up one level, so that text holds the actual information (not the token name) and type is 4 or 7, i.e. ALIAS or ATTR_NAME. Is there a way to do this?
This part of my current grammar uses the imaginary tokens ATTR_NAME and ALIAS, like so (comment if I need to post more of my grammar, but I think this is enough to solve it):
primaryExpr
: compExpr
| inExpr
| parameterExpr
| attribute
;
parameterExpr
: attribute (EQ | NE | GT | LT | GE | LE)^ parameter
| aliasdAttribute (EQ | NE | GT | LT | GE | LE)^parameter
;
compExpr
: attribute (EQ | NE | GT | LT | GE | LE)^ value
| aliasdAttribute(EQ | NE | GT | LT | GE | LE)^value
;
alias
: ID
;
inExpr : attribute IN^ valueList
;
attribute: ID -> ^(ATTR_NAME ID);
aliasdAttribute
: alias(DOT)(ID) -> ^(ALIAS alias ) ^(ATTR_NAME ID)
;
Is there a way to do this?
Sure.
The alias rule in grammar T:
grammar T;
options {
output=AST;
}
tokens {
ALIAS;
}
alias
: ID -> ALIAS[$ID.text]
;
ID : ('a'..'z' | 'A'..'Z')+;
will always produce (rewrite) a token with type ALIAS, but with inner text the same as the ID token.

How to add a setter to discriminated unions in F#

I want to add a settable property to a discriminated union. How should I do it?
For example:
type Factor =
| Value of Object
| Range of String
let mutable myProperty = 123
member this.MyProperty
with get() = myProperty
and set(value) = myProperty <- value
Here's how I might approach it:
type Value = { value: obj; mutable MyProperty: int }
type Range = { range: string; mutable MyProperty: int }
type Factor =
| Value of Value
| Range of Range
member this.MyProperty
with get() =
match this with
| Value { MyProperty=myProperty }
| Range { MyProperty=myProperty } -> myProperty
and set(myProperty) =
match this with
| Value x -> x.MyProperty <- myProperty
| Range x -> x.MyProperty <- myProperty
and use it like so:
let v = Value {value="hi":>obj ; MyProperty=0 }
v.MyProperty <- 2
match v with
| Value { value=value } as record ->
printfn "Value of value=%A with MyProperty=%i" value record.MyProperty
| _ ->
printfn "etc."
I've used this technique in a similar scenario to yours with happy results in FsEye's watch model: http://code.google.com/p/fseye/source/browse/tags/2.0.0-beta1/FsEye/WatchModel.fs.
Why not use a class and an active pattern:
type _Factor =
| Value_ of obj
| Range_ of string
type Factor(arg:_Factor) =
let mutable myProperty = 123
member this._DU = arg
member this.MyProperty
with get() = myProperty
and set(value) = myProperty <- value
let (|Value|Range|) (arg:Factor) =
match arg._DU with
|Value_(t) -> Value(t)
|Range_(t) -> Range(t)
This will obviously be significantly slower, but it allows you to do what you want.
I'm not too familiar with F# yet, but I suppose you can't do this; it doesn't make sense. Discriminated unions, as their name suggests, are unions: they represent a choice between cases. You're trying to attach mutable state to one. What are you trying to achieve? What's the use case?
Perhaps all you need is to add an additional "parameter" to your DU, i.e. if you have
type DU =
| A of int
| B of string
and you want to add a setter for an int value, you can extend the DU like this (note that Set returns a new value rather than mutating in place):
type DU =
| A of int * int
| B of string * int
member x.Set i =
match x with
| A(a1, a2) -> A(a1, i)
| B(b1, b2) -> B(b1, i)

OCaml: circularity between variant type and module definition

I'm switching from Haskell to OCaml but I'm having some problems. For instance, I need a type definition for regular expressions. I do so with:
type re = EmptySet
| EmptyWord
| Symb of char
| Star of re
| Conc of re list
| Or of (RegExpSet.t * bool) ;;
The elements inside the Or are in a set (RegExpSet), so I define it next (and also a map function):
module RegExpOrder : Set.OrderedType =
struct
let compare = Pervasives.compare
type t = re
end
module RegExpSet = Set.Make( RegExpOrder )
module RegExpMap = Map.Make( RegExpOrder )
However, when I do "ocaml [name of file]" I get:
Error: Unbound module RegExpSet
in the line of "Or" in the definition of "re".
If I swap these definitions, that is, if I write the modules definitions before the re type definitions I obviously get:
Error: Unbound type constructor re
in the line of "type t = re".
How can I solve this?
Thanks!
You can try to use recursive modules. For instance, the following compiles:
module rec M :
sig type re = EmptySet
| EmptyWord
| Symb of char
| Star of re
| Conc of re list
| Or of (RegExpSet.t * bool)
end =
struct
type re = EmptySet
| EmptyWord
| Symb of char
| Star of re
| Conc of re list
| Or of (RegExpSet.t * bool) ;;
end
and RegExpOrder : (Set.OrderedType with type t = M.re) =
struct
let compare = Pervasives.compare
type t = M.re
end
and RegExpSet : (Set.S with type elt = M.re) = Set.Make( RegExpOrder )

Recursive Set in OCaml

How can I define a Set in OCaml that can also contain elements of its own type?
To explain the problem: I have a type declaration covering a lot of data types, like
type value =
Nil
| Int of int
| Float of float
| Complex of Complex.t
| String of string
| Regexp of regexp
| Char of char
| Bool of bool
| Range of (int*int) list
| Tuple of value array
| Lambda of code
| Set of ValueSet.t (* this isn't allowed in my case since module is declared later*)
In addition I declare a concrete module for ValueSet later in the same file:
module ValueSet = Set.Make(struct type t = value let compare = Pervasives.compare end)
The problem is that ValueSet has value as its elt type, but a value can itself be a ValueSet, so I run into trouble when trying to compile it.
All of these declarations are contained in a single file named types.ml (which has its own interface, types.mli, but without any ValueSet module declaration, since I'm not sure that's even possible).
Can this problem be solved in some way?
You can use recursive modules. The language manual uses precisely this example of a recursive set type to illustrate the feature. Below is the relevant excerpt.
A typical example of a recursive module definition is:
module rec A : sig
type t = Leaf of string | Node of ASet.t
val compare: t -> t -> int
end
= struct
type t = Leaf of string | Node of ASet.t
let compare t1 t2 =
match (t1, t2) with
(Leaf s1, Leaf s2) -> Pervasives.compare s1 s2
| (Leaf _, Node _) -> 1
| (Node _, Leaf _) -> -1
| (Node n1, Node n2) -> ASet.compare n1 n2
end
and ASet : Set.S with type elt = A.t
= Set.Make(A)
It can be given the following specification:
module rec A : sig
type t = Leaf of string | Node of ASet.t
val compare: t -> t -> int
end
and ASet : Set.S with type elt = A.t