I want to translate a given ABNF grammar into a valid ParseKit grammar. Specifically, I'm trying to find a solution for this kind of statement:
tag = 1*<any Symbol except "C">
with
Symbol = "A" / "B" / "C" / "D" // a lot more symbols here...
The symbol definition is simplified for this question and normally contains a lot of special characters.
My current solution is to hard code all allowed symbols for tag, like
tag = ('A' | 'B' | 'D')+;
But what I really want is something like a "without operator"
tag = Symbol \ 'C';
Is there any construct that allows me to keep my symbol list and define some excludes?
Developer of ParseKit here.
Yes, there is a feature for exactly this. Here's an example:
allItems = 'A' | 'B' | 'C' | 'D';
someItems = allItems - 'C';
Use the - operator.
Related
I would like to write a simpler statement instead of 26 separate "NOT LIKE" conditions, so that I can cover all letters of the alphabet at once rather than listing each letter individually. Does anyone have an idea how to do that? Thank you.
SELECT *
FROM name
WHERE flag LIKE 'Y'
AND name.autotrackchild IS NULL
AND substring(name.lot,LENGTH(name.lot),length(name.lot)) NOT LIKE 'A'
AND substring(name.lot,LENGTH(name.lot),length(name.lot)) NOT LIKE 'B'
AND substring(name.lot,LENGTH(name.lot),length(name.lot)) NOT LIKE 'C'
--REMOVES CHILD LOTS (ANYTHING WITH A LETTER ON THE END OF ITS LENGTH)
You would use regular expressions:
where flag like 'Y' and
regexp_like(name.lot, '[^A-Z]$')
The following is sufficient for the goal:
and right(name.lot, 1) not between 'A' and 'Z'
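Both answers express the same check: keep only rows whose last character is not an uppercase letter. A quick way to sanity-check the regex outside the database is with Python's re module (a sketch; the lot values below are made up for illustration):

```python
import re

# Mirrors regexp_like(name.lot, '[^A-Z]$'):
# keep lots whose last character is NOT an uppercase letter.
keep = re.compile(r'[^A-Z]$')

lots = ["1234", "1234A", "5678B", "9012"]
parents = [lot for lot in lots if keep.search(lot)]
# parents == ["1234", "9012"]
```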
I'm trying to write a grammar for a calculator, but it has to work only with odd numbers.
For example, it currently works like this:
If I enter 123, the result is 123.
If I enter 1234, the result is 123, with a token recognition error at 4; but the error should cover all of 1234.
Here is my grammar:
grammar G;
DIGIT: ('0'..'9') * ('1' | '3' | '5' | '7'| '9');
operator : ('+' | '-' | '*' | ':');
result: DIGIT operator (DIGIT | result);
Specifically, I want the whole input 1234 to be recognized as an error, not only the last digit.
The way that tokenization works is that it tries to find the longest prefix of the input that matches any of your regular expressions and then produces the appropriate token, consuming that prefix. So when the input is 1234, it sees 123 as the longest prefix that matches the DIGIT pattern (which should really be called ODD_INT or something) and produces the corresponding token. Then it sees the remaining 4 and produces an error because no rule matches it.
Note that it's not necessarily only the last digit that produces the error. For the input 1324, it would produce a DIGIT token for 13 and then a token recognition error for 24.
So how can you get the behaviour that you want? One approach would be to rewrite your pattern to match all sequences of digits and then use a semantic predicate to verify that the number is odd. The way that semantic predicates work on lexer rules is that it first takes the longest prefix that matches the pattern (without taking into account the predicate) and then checks the predicate. If the predicate is false, it moves on to the other patterns - it does not try to match the same pattern to a smaller input to make the predicate return true. So for the input 1234, the pattern would match the entire number and then the predicate would return false. Then it would try the other patterns, none of which match, so you'd get a token recognition error for the full number.
ODD_INT: ('0'..'9') + { Integer.parseInt(getText()) % 2 == 1 }?;
The down side of this approach is that you'll need to write some language-specific code (and if you're not using Java, you'll need to adjust the above code accordingly).
Alternatively, you could just recognize all integers in the lexer - not just odd ones - and then check whether they're odd later during semantic analysis.
If you do want to check the oddness using patterns only, you can also work around the problem by defining rules for both odd and even integers:
ODD_INT: ('0'..'9') * ('1' | '3' | '5' | '7'| '9');
EVEN_INT: ('0'..'9') * ('0' | '2' | '4' | '6'| '8');
This way for an input like 1234, the longest match would always be 1234, not 123. It's just that this would match the EVEN_INT pattern, not ODD_INT. So you wouldn't get a token recognition error, but, if you consistently only use ODD_INT in the grammar, you would get an error saying that an ODD_INT was expected, but an EVEN_INT found.
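The semantic-predicate behavior described above can be sketched in plain Python (an illustration of the mechanism, not ANTLR itself; the function name is made up): take the longest digit prefix first, then check the predicate, and never retry a shorter prefix when the predicate fails.

```python
import re

def lex_odd_int(text):
    """Sketch of ANTLR's behaviour for  ODD_INT: [0-9]+ { odd }? .
    The lexer takes the longest digit prefix first, then evaluates the
    predicate; it does not retry shorter prefixes if the predicate fails."""
    m = re.match(r'[0-9]+', text)
    if m and int(m.group()) % 2 == 1:
        return ('ODD_INT', m.group())
    # Predicate failed (or no digits): the whole matched prefix errors out.
    return ('ERROR', m.group() if m else text[:1])

# lex_odd_int("123")  -> ('ODD_INT', '123')
# lex_odd_int("1234") -> ('ERROR', '1234')   # the entire number is rejected
```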
First of all, apologies if this is a duplicate question. I've done my best to search but was unsuccessful, and I couldn't even properly word my question in terms of keywords!
I need to write a Postgres query that will find all the rows that contain no letters other than the given ones. I already found out that I'll need to use a LIKE statement with a regex, but I have no idea how to write the proper conditions.
Example: letters are 'A', 'P', 'P', 'L', 'E'.
The query should return
APP
PAL
LAP
APPLE
The query should not return
PELT
PIE
etc...
Any thoughts?
You can use the ~ operator for regular expression matching in PostgreSQL instead of LIKE, and your regex should look like this:
SELECT * FROM tablename WHERE col ~ '^[APLE]+$';
which means: match one or more characters, each drawn from the class [APLE], across the whole string.
So it can return :
APP
PAL
LAP
APPLE
and not :
PELT
PIE
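The same pattern can be checked outside the database with Python's re module (a sketch using the example words from the question):

```python
import re

# Anchored character class: every character must be one of A, P, L, E.
only_apple = re.compile(r'^[APLE]+$')

words = ["APP", "PAL", "LAP", "APPLE", "PELT", "PIE"]
matched = [w for w in words if only_apple.match(w)]
# matched == ["APP", "PAL", "LAP", "APPLE"]
```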
Write a grammar that generates strings that contain matched brackets and parentheses. Examples of valid strings are:
[([])]
()()[[]]
[[]][()]()
Examples of invalid strings are:
[}
[[]
()())
][()
My answer:
< string > -> < term >*
< term > -> (< string >) | [< string >]
If this works the way I think it does, then a < string > turns into zero or more terms, each of which is put in brackets or parentheses and then filled with zero or more terms again. However, I'm not sure about the asterisk (the Kleene star, meaning zero or more repetitions) and haven't been able to find any examples of someone using it the way I did.
Sorry if I'm way off.
Turns out the answer was:
< S > -> < S >< S > | (< S >) | [< S >] | () | []
Mine was marked "not valid in BNF, no base cases". Oh well.
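Membership in the language generated by that grammar can be checked with a simple stack, since the grammar describes exactly the balanced strings over '()' and '[]'. A sketch (the function name is made up; the empty string is excluded, matching the grammar's lack of an empty production):

```python
def matches_grammar(s):
    """Check membership in the language of
    S -> SS | (S) | [S] | () | []
    using a stack of open brackets."""
    if not s:
        return False
    pairs = {')': '(', ']': '['}
    stack = []
    for ch in s:
        if ch in '([':
            stack.append(ch)
        elif ch in pairs:
            # A closer must match the most recent opener.
            if not stack or stack.pop() != pairs[ch]:
                return False
        else:
            return False  # character outside the alphabet
    return not stack  # every opener must be closed

# valid:   "[([])]", "()()[[]]", "[[]][()]()"
# invalid: "[}", "[[]", "()())", "][()"
```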
I have to generate a parser for CSV data. Somehow I managed to write a BNF and EBNF for CSV data, but I don't know how to convert this into an ANTLR grammar (ANTLR is a parser generator). For example, in EBNF we write:
[{header entry}newline]newline
but when I write this in ANTLR to generate a parser, it gives an error and doesn't accept the brackets. I'm no expert in ANTLR; can anyone help?
hi , i have to generate parser of CSV data ...
In most languages I know, there already exists a decent 3rd party CSV parser. So, chances are that you're reinventing the wheel.
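For instance, Python ships a CSV parser in its standard library (a minimal sketch, not tied to the asker's exact format):

```python
import csv
import io

# Parse an in-memory CSV document with the stdlib csv module.
data = "name,age\nalice,30\nbob,25\n"
rows = list(csv.reader(io.StringIO(data)))
# rows == [['name', 'age'], ['alice', '30'], ['bob', '25']]
```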
For example, in EBNF we write [{header entry}newline]newline
The equivalent in ANTLR would look like this:
((header entry)* newline)? newline
In other words:
| (E)BNF | ANTLR
-----------------+--------+------
'a' zero or once | [a] | a?
'a' zero or more | {a} | a*
'a' once or more | a {a} | a+
Note that you can group rules by using parentheses (sub-rules is what they're called):
'a' 'b'+
matches: ab, abb, abbb, ..., while:
('a' 'b')+
matches: ab, abab, ababab, ...
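The same grouping distinction holds in ordinary regular expressions, which can serve as a quick way to check it (a sketch using Python's re module):

```python
import re

# 'a' 'b'+ in ANTLR corresponds to the regex ab+ :
# one a followed by one or more b's.
assert re.fullmatch(r'ab+', 'abbb')
assert not re.fullmatch(r'ab+', 'abab')

# ('a' 'b')+ corresponds to (ab)+ :
# one or more ab pairs.
assert re.fullmatch(r'(ab)+', 'abab')
assert not re.fullmatch(r'(ab)+', 'abb')
```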