Is there an algorithm to convert a right-linear grammar to a left-linear grammar? - grammar

Is there an algorithm to convert a right-linear grammar into an equivalent left-linear grammar?

For every right-linear grammar, there exists an equivalent left-linear grammar that generates the same language, and vice-versa.
1. Use the grammar to build the FSA that recognizes the language generated by the original grammar.
2. Swap the initial states with the final states.
3. Invert the orientation of the arrows.
4. If multiple initial states result, mark them as non-initial, create a dummy initial state, and link it to them with spontaneous (ε) moves.
5. From the modified FSA, obtain another right-linear grammar, using the "standard" approach.
6. Reverse the right-hand side of every production of that grammar.
You should get an equivalent left-linear grammar.
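These steps can be sketched in Python under a toy encoding: nonterminals are single uppercase letters, terminals are single lowercase letters, and 'F' and 'I' are assumed to be fresh names not used by the input grammar. The dummy initial state is simulated by copying the initial states' productions rather than by literal ε-moves.

```python
def language(rules, start, max_len):
    """All terminal strings of length <= max_len derivable from `start`.
    Convention: uppercase chars are nonterminals, lowercase are terminals."""
    done, seen, stack = set(), set(), [start]
    while stack:
        form = stack.pop()
        if form in seen or sum(c.islower() for c in form) > max_len:
            continue
        seen.add(form)
        i = next((j for j, c in enumerate(form) if c.isupper()), None)
        if i is None:
            done.add(form)
            continue
        for rhs in rules.get(form[i], []):
            stack.append(form[:i] + rhs + form[i + 1:])
    return done

def right_to_left_linear(rules, start):
    """Right-linear grammar (RHSs of the form '', 'a' or 'aB') to an
    equivalent left-linear grammar, via the FSA construction."""
    # Step 1: grammar -> FSA ('F' is a fresh final state).
    edges, finals = set(), {'F'}
    for A, rhss in rules.items():
        for rhs in rhss:
            if rhs == '':
                finals.add(A)                    # A -> ε: A is a final state
            elif len(rhs) == 1:
                edges.add((A, rhs, 'F'))         # A -> a: edge A --a--> F
            else:
                edges.add((A, rhs[0], rhs[1]))   # A -> aB: edge A --a--> B
    # Steps 2-3: swap initial/final states and invert the arrows.
    edges = {(q, a, p) for (p, a, q) in edges}
    # Step 5: reversed FSA -> right-linear grammar for reverse(L).
    rev = {}
    for (p, a, q) in edges:
        rev.setdefault(p, []).append(a + q)
    rev.setdefault(start, []).append('')  # the old initial state is now final
    # Step 4: several initial states -> one dummy initial state 'I'.
    rev['I'] = [rhs for q in finals for rhs in rev.get(q, [])]
    # Step 6: reversing every RHS turns the grammar for reverse(L)
    # into a left-linear grammar for L itself.
    return {A: [rhs[::-1] for rhs in rhss] for A, rhss in rev.items()}, 'I'

right = {'S': ['aS', 'b']}   # generates a*b
left, left_start = right_to_left_linear(right, 'S')
print(language(right, 'S', 4) == language(left, left_start, 4))  # True
```

With the sample grammar S → aS | b, the result contains productions such as S → Sa | ε and I → Sb, a left-linear grammar for a*b, and enumerating short strings from both grammars confirms they agree.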


Regular grammars

In a regular grammar with the following rules
S->aS/aSbS/ε is it acceptable to do the following steps:
S->aSbS->a{aSbS}bS->aa{aSbS}bSbS->aaa{aSbS}bSbSbS
Do I have to replace every S in each step, or can I replace just one S out of two, for example? In aSbS, can I do aSb (following the rule S → ε)? And if I cannot,
do I have to replace all the S's with the same rule, i.e. aSbS → a(aS)b(aS) (following the rule S → aS), or can I do aSbS → a(aS)b(aSbS)?
P.S. I use the parentheses to indicate which S's I have replaced.
In a formal grammar, a derivation step consists of replacing a single instance of the left-hand side of a rule with the right-hand side of that rule. In the case of a context-free grammar, the left-hand side is a single non-terminal, so a derivation step consists of replacing a single instance of a non-terminal with one of the possible corresponding right-hand sides.
You never perform two replacements at the same time, and each derivation step is independent of every other derivation step, so in a context-free grammar, there is no way to express the constraint that two non-terminals need to be replaced with the same right-hand side. (In fact, you cannot directly express this constraint with a context-sensitive grammar either, but you can achieve the effect by using markers.)
Regular grammars are a subset of context-free grammars where either every right-hand side which includes a non-terminal has the form aC or every right-hand side which includes a non-terminal has the form Ba. (This comes straight out of the Wikipedia page which you link.) Your grammar is not a regular grammar, because the rule S → aSbS has two non-terminals on the right-hand side, which is not either of the regular forms.
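To make the single-replacement convention concrete, here is a minimal Python sketch; the `step` helper, the position argument, and the string encoding (S as the only nonterminal) are illustrative assumptions, not a standard API:

```python
def step(form, i, rhs):
    """One derivation step: replace the single nonterminal at position i
    with one right-hand side of a rule for it."""
    assert form[i] == 'S', "there must be an S at position i"
    return form[:i] + rhs + form[i + 1:]

# Rules: S -> aS / aSbS / ε
d = 'S'
d = step(d, 0, 'aSbS')  # S => aSbS
d = step(d, 1, '')      # aSbS => abS : only ONE of the two S's, via S -> ε
d = step(d, 2, 'aS')    # abS => abaS : the other S, via a different rule
print(d)  # abaS
```

Each call rewrites exactly one occurrence, and each occurrence may use a different rule, which is exactly the point of the answer above.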

Antlr graphical rule representation

Is there a tool to create a graphical representation of one's ANTLR4 grammar, that is, of the parser/lexer rules, e.g. as a graphical representation of a finite state machine?
It should be possible to represent it this way, since the grammar is in Backus-Naur form.
Example:
plus: INT '+' INT | plus '+' INT
INT: [0-9]+
A corresponding finite-state machine would be
start -> INT <-> plus
                  |
                  v
                 exit
There may also be other useful graphical representations than a finite-state machine. The goal is to provide a different perspective in order to make debugging/understanding the grammar easier.
You probably want something like this: https://github.com/bkiers/rrd-antlr4. These types of graphics are called railroad diagrams.
Another solution is to use ANTLRWorks 2.1. There's a view called "Syntax Diagram" included that can generate railroad diagrams of your parser rules and your lexer rules.
I'm using those images for my master's thesis and the process has worked fine so far.

Can a context-sensitive grammar have an empty string?

In one of my CS classes they mentioned that the difference between context-free grammars and context-sensitive grammars is that in a CSG, the left side of a production rule has to be no longer than the right side.
So, one example they gave was that context-sensitive grammars can't have an empty string, because then that rule wouldn't be satisfied.
However, I have understood that regular grammars are contained in the context-free grammars, context-free grammars are contained in the context-sensitive grammars, and context-sensitive grammars are contained in the recursively enumerable grammars.
So, for example, if a grammar is regular, then it is also context-free, context-sensitive and recursively enumerable.
The problem is that if I have a context-free grammar that generates the empty string, then it does not satisfy the rule to be counted as context-sensitive, and a contradiction occurs, because every context-free grammar should be context-sensitive.
Empty productions ("lambda productions", so-called because λ is often used to refer to the empty string) can be mechanically eliminated from any context-free grammar, except for the possible top-level production S → λ. The algorithm to do so is presented in pretty well every text on formal language theory.
So for any CFG with lambda productions, there is an equivalent CFG without lambda productions which generates the same language, and which is also a context-sensitive grammar. Thus, the prohibition on contracting rules in CSGs does not affect the hierarchy of languages: any context-free language is a context-sensitive language.
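As a sketch of that elimination algorithm, under a toy encoding (uppercase characters are nonterminals, lowercase are terminals, and '' stands for λ):

```python
from itertools import product

def eliminate_lambda(rules, start):
    """Remove lambda productions from a CFG, keeping the empty RHS only
    for the start symbol, and only if it derives the empty string."""
    # 1. Find the nullable nonterminals (fixed point).
    nullable = set()
    changed = True
    while changed:
        changed = False
        for A, rhss in rules.items():
            if A not in nullable and any(all(c in nullable for c in rhs)
                                         for rhs in rhss):
                nullable.add(A)
                changed = True
    # 2. For every production, emit every variant with nullable symbols
    #    optionally dropped; discard empty right-hand sides.
    new = {}
    for A, rhss in rules.items():
        out = set()
        for rhs in rhss:
            options = [('', c) if c in nullable else (c,) for c in rhs]
            for choice in product(*options):
                v = ''.join(choice)
                if v:
                    out.add(v)
        new[A] = sorted(out)
    # 3. Allow S -> λ only if the original start symbol was nullable.
    if start in nullable:
        new[start] = sorted(set(new[start]) | {''})
    return new

print(eliminate_lambda({'S': ['AB'], 'A': ['a', ''], 'B': ['b', '']}, 'S'))
# {'S': ['', 'A', 'AB', 'B'], 'A': ['a'], 'B': ['b']}
```

For the sample grammar S → AB, A → a | λ, B → b | λ, the result generates the same language {ab, a, b, λ} with the only empty RHS attached to the start symbol.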
Chomsky's original definition of context-sensitive grammars did not specify the non-contracting property, but rather an even more restrictive one: every production had to be of the form αAβ→αγβ where A is a single symbol and γ is not empty. This set of grammars generates the same set of languages as non-contracting grammars (that was also proven by Chomsky), but it is not the same set. Also, his context-free grammars were indeed a subset of context-sensitive grammars because by his original definition of CFGs, lambda productions were prohibited. (The 1959 paper is available online; see the Wikipedia article on the Chomsky hierarchy for a reference link.)
It is precisely the existence of a non-empty context -- α and β -- which leads to the names "context-sensitive" and "context-free"; it is much less clear what "context-sensitive" might mean with respect to an arbitrary non-contracting rule such as AB→BA . (Note 1)
In short, the claim that "every CFG is a CSG" is not technically correct given the common modern usage of CFG and CSG, as cited in your question. But it is only a technicality: the CFG with lambda productions can be mechanically transformed, just as a non-contracting grammar can be mechanically transformed into a grammar fitting Chomsky's definition of context-sensitive (see the Wikipedia article on non-contracting grammars).
(It is also quite common to allow both context-sensitive and context-free languages to include the empty string, by adding an exception for the rule S→λ to both CFG and CSG definitions.)
Notes
In Chomsky's formulation of context-free and -sensitive grammars, it was unambiguous what was meant by a parse tree, even for a CSG; since Chomsky is a linguist and was seeking a framework to explain the structure of natural language, the notion of a parse tree mattered. It is not at all obvious how you might describe the result of AB → BA as a parse tree, although it is quite clear in the case of, for example, a A b → a B b.

How to prove that a grammar is LL(k) for k>1

I have this exercise which gives me a grammar and asks me to prove that it is not LL(1). All good with that part, but afterwards it asks whether that grammar can be LL(k) for some k > 1. What procedure do I follow to determine that?
For a given k and a non-left-recursive grammar, all you have to do is build the LL(k) parsing table (by algorithms readily available everywhere). If no table cell ends up holding more than one production, the grammar is LL(k), and the language is too.
Knowing if there exists a k for which a given language is LL(k) is undecidable. You'd have to try one value of k after the other until you succeed, or the universe runs out.

Mutually left-recursive?

I'm working on a parser for a grammar in ANTLR. I'm currently working on expressions where () has the highest order precedence, then Unary Minus, etc.
When I add the line, ANTLR gives the error: "The following sets of rules are mutually left-recursive [add, mul, unary, not, and, expr, paren, accessMem, relation, or, assign, equal]". How can I go about solving this issue? Thanks in advance.
The easiest answer is to use ANTLR 4, not 3, which has no problem with immediate left recursion: it automatically rewrites the grammar under the covers to do the right thing. There are plenty of examples; one could, for example, examine the Java grammar or my little blog entry on left-recursive rules. If you are stuck with v3, there are earlier versions of the Java grammar, and also plenty of material on how to build arithmetic expression rules in the documentation and the book.
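That automatic rewrite is essentially the textbook elimination of immediate left recursion: A → Aα | β becomes A → βA', A' → αA' | ε. A sketch in Python (the grammar encoding and the primed helper name are assumptions; each RHS is a list of symbol names):

```python
def remove_immediate_left_recursion(rules):
    """Rewrite A -> A a | b into A -> b A', A' -> a A' | ε,
    which generates the same strings without left recursion.
    rules: dict nonterminal -> list of RHSs, each RHS a list of symbols."""
    new = {}
    for A, rhss in rules.items():
        alphas = [rhs[1:] for rhs in rhss if rhs and rhs[0] == A]
        betas = [rhs for rhs in rhss if not rhs or rhs[0] != A]
        if not alphas:                # no immediate left recursion on A
            new[A] = rhss
            continue
        Ap = A + "'"                  # helper nonterminal, assumed fresh
        new[A] = [b + [Ap] for b in betas]
        new[Ap] = [a + [Ap] for a in alphas] + [[]]   # [] is ε
    return new

# The rule from the earlier ANTLR question: plus: INT '+' INT | plus '+' INT
rules = {'plus': [['INT', '+', 'INT'], ['plus', '+', 'INT']]}
print(remove_immediate_left_recursion(rules))
# {'plus': [['INT', '+', 'INT', "plus'"]], "plus'": [['+', 'INT', "plus'"], []]}
```

The rewritten rules derive INT '+' INT followed by zero or more '+' INT, the same language as the left-recursive original, which is roughly what ANTLR 4 does for you internally.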