How to convert a regular grammar to a finite automaton?

How does one convert a regular grammar into a finite automaton (FA)? For instance, what would a finite automaton corresponding to the following regular grammar look like?
VN = {S, B, D} (nonterminals)
VT = {a, b, c} (terminals)
P = {S -> aB, S -> bB, B -> bD, D -> b, D -> aD, B -> cB, B -> aS} (productions)

The good news is that this is not too hard. The idea is that each of the nonterminals will become a state in a nondeterministic finite automaton accepting the language of the grammar, and productions will become transitions. Our NFA will have states S, B and D, and will transition among those states according to the production rules. Our NFA looks like this:
       ___a____      _a_
      /        \    /   \
      |        |    \   /
      V        |     \ /
----->S--a,b-->B--b-->D
              / \
             /   \
             \_c_/
There is one dangling production, D -> b, which we haven't added yet. We need to introduce another state, not corresponding to a nonterminal symbol, to allow us to transition from D on b and accept some strings:
       ___a____      _a_
      /        \    /   \
      |        |    \   /
      V        |     \ /
----->S--a,b-->B--b-->D--b-->Q
              / \
             /   \
             \_c_/
Now if we make S the initial state and Q the accepting state, we have an NFA that works. If we want a DFA, we might notice that this FA is only nondeterministic because we are lacking required transitions from states S, D and Q. We can add the missing transitions by introducing a new dead state X which will keep track of the NFA we just derived having crashed at some point during its processing:
       ___a____      _a_
      /        \    /   \
      |        |    \   /
      V        |     \ /
----->S--a,b-->B--b-->D--b-->Q
      |       / \     |      |
      |      /   \    |      |    a,b,c
      c      \_c_/    c    a,b,c  /   \
      |               |      |    \   /
      V               V      V     \ /
      +---------------+------+----->X
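The construction above can be sketched in a few lines of Python. This is only an illustration under some assumptions: the function names (grammar_to_nfa, accepts, complete) are made up here, every production is assumed to have the shape "terminal + nonterminal" or a single terminal, and Q and X are the extra accepting and dead states from the diagrams.

```python
# Productions of the grammar from the question, as (head, body) pairs.
PRODUCTIONS = [("S", "aB"), ("S", "bB"), ("B", "bD"), ("D", "b"),
               ("D", "aD"), ("B", "cB"), ("B", "aS")]

def grammar_to_nfa(productions, start, final="Q"):
    """Each nonterminal becomes a state; X -> aY becomes X --a--> Y,
    and X -> a becomes X --a--> final (the extra accepting state)."""
    delta = {}
    for head, body in productions:
        symbol, target = body[0], (body[1:] or final)
        delta.setdefault((head, symbol), set()).add(target)
    return delta, start, {final}

def accepts(nfa, word):
    """Simulate the NFA by tracking the set of reachable states."""
    delta, start, accepting = nfa
    current = {start}
    for ch in word:
        current = set().union(*(delta.get((q, ch), set()) for q in current))
    return bool(current & accepting)

def complete(delta, states, alphabet, dead="X"):
    """Add a dead state so every (state, symbol) pair has a transition.
    Only valid when no pair has more than one target, as is the case here."""
    total = {}
    for q in states | {dead}:
        for a in alphabet:
            targets = delta.get((q, a), set())
            total[(q, a)] = next(iter(targets)) if targets else dead
    return total

nfa = grammar_to_nfa(PRODUCTIONS, "S")
dfa = complete(nfa[0], {"S", "B", "D", "Q"}, "abc")
print(accepts(nfa, "abb"), accepts(nfa, "ab"))
print(dfa[("S", "c")], dfa[("Q", "a")])
```

The string abb follows S -a-> B -b-> D -b-> Q and is accepted, while ab stops in D and is rejected; the completed table sends the missing transitions (S on c, D on c, everything from Q) to X.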

Related

CFG for constituent tree of As Bs and as and bs

I have the following constituent tree, and I am confused on how to find the context-free grammar as I am not used to these notations of As and Bs.
        S
       / \
      A   A
     / \  |
    A   B a
   / \  |
  A   B b
  |   |
  a   b
I thought of the following CFG:
S -> AA
A -> AB
A -> a
B -> b
Does this make sense?
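One way to check a grammar read off a constituent tree is to collect one production per internal node (the node's label on the left, its children's labels on the right). A minimal sketch, assuming the tree is encoded as nested tuples; the encoding and the function name productions are made up for illustration:

```python
def productions(tree):
    """Collect one rule (label, children-labels) per internal node.
    A tree is (label, child, child, ...); leaves are plain strings."""
    rules = set()

    def visit(node):
        if isinstance(node, str):      # a leaf (terminal)
            return node
        label, *children = node
        rhs = tuple(visit(child) for child in children)
        rules.add((label, rhs))
        return label

    visit(tree)
    return rules

# The tree from the question: S -> A A, left A -> A B, and so on.
TREE = ("S",
        ("A", ("A", ("A", "a"), ("B", "b")), ("B", "b")),
        ("A", "a"))

for head, rhs in sorted(productions(TREE)):
    print(head, "->", " ".join(rhs))
```

For this tree the collected rules are exactly S -> A A, A -> A B, A -> a and B -> b, which matches the grammar proposed in the question.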

Context Free Grammar BNF

I need help with a non-extended BNF grammar:
Σ = {a,b,c}
L = {ω ∈ Σ^* | all a's (if any) come before all c's (if any)}
For example, the strings aba, cbc, and abacbc are in the language, but the string abcabc is not.
This is what I have so far (is it correct? Please correct me if I am wrong):
s -> asbsc | bsasc | ascsb | ε
Your comment says you want equal numbers of a and c, so start with the simple grammar that does that:
S -> aSc | ε
and add in any number of b's before/after/between those:
S -> BaScB | B
B -> Bb | ε
Note that the above is not ambiguous (it's even LR(1)).
If you want to allow a different number of a's and c's, you can use the same approach to avoid ambiguity. Start with just the a's and c's:
S -> AC
A -> Aa | ε
C -> Cc | ε
and add in b's at the beginning and after each other character:
S -> BAC
A -> AaB | ε
C -> CcB | ε
B -> Bb | ε
Does the number of a's and c's need to be the same? If not, then you are missing the cases where they differ, such as aac. I think something like this should work:
S -> AC
A -> aA | bA | ε
C -> bC | cC | ε
The A production is used for deriving a sequence of characters that are not a c and the C production is used for deriving a sequence of characters that are not an a.
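The last grammar reads as "a run of characters that are not c, followed by a run of characters that are not a", which is the regular expression [ab]*[bc]*. As a sanity check (not a BNF grammar, just a test harness), the sketch below compares that pattern against the definition from the question on all short strings; the helper names are made up here.

```python
import re
from itertools import product

# A-part (no c) followed by C-part (no a), mirroring S -> AC.
PATTERN = re.compile(r"[ab]*[bc]*")

def in_language(s):
    """Definition from the question: every a comes before every c."""
    if "c" not in s:
        return True
    return "a" not in s[s.find("c"):]

def matches_grammar_shape(s):
    return PATTERN.fullmatch(s) is not None

# Compare both predicates on every string over {a, b, c} up to length 6.
mismatches = [
    "".join(w)
    for n in range(7)
    for w in product("abc", repeat=n)
    if in_language("".join(w)) != matches_grammar_shape("".join(w))
]
print(mismatches)
print([matches_grammar_shape(s) for s in ("aba", "cbc", "abacbc", "abcabc")])
```

The list of mismatches comes out empty, and the four example strings from the question are classified as in, in, in, out.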

Ambiguous grammar

I am looking at the following grammar and I believe it's ambiguous at line 3, but I'm not sure.
<SL> → <S>
<SL> → <SL> <S>
<S> → i <B> <S> e <S>
<S> → i <B> <S>
<S> → x
<S> → y
<B> → 5
<B> → 13
I found the string xi13yi5xeyx, which I believe generates two different parse trees, but I'm not sure whether I'm doing it wrong.
Can someone please verify my findings?
Yes, your grammar is ambiguous!
You have not mentioned it, but I assume <SL> is the start variable.
Using your grammar rules we can draw more than one parse tree (two) for the string i5i5yey, as follows:
          <SL>                           <SL>
           |                              |
          <S>                            <S>
   /   /   |   \   \              /       |       \
  i  <B>  <S>   e  <S>           i       <B>      <S>
      |  / | \      |                     |  /  /  | \  \
      5 i <B> <S>   y                     5 i <B> <S> e <S>
           |   |                               |   |     |
           5   y                               5   y     y
The two parse trees have different structures, so the grammar is ambiguous!
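Note that the string xi13yi5xeyx from the question is not itself a witness: there the statement i13y is followed by a new i statement rather than by e, so the e can only belong to i5x and the string has a single parse tree. The ambiguity only shows up when an i <B> <S> is nested directly inside another one and followed by e, as in i5i5yey. (Checking this by hand is a good exercise.)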
Importantly, the language generated by this grammar is not an inherently ambiguous language: it is possible to write an equivalent unambiguous grammar that always generates a unique parse tree for each string in the language.
HINT: To write an unambiguous grammar.
This grammar is quite similar to the grammar for the if statement in C (note that other languages have different syntax for if), and the problem is solved in almost every compiler design book.
Resolving the General Dangling Else/If-Else Ambiguity
Reference: the book Compilers: Principles, Techniques, and Tools (Aho, Ullman et al.), Section 4.5, Conflicts During Shift-Reduce Parsing.
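If you want to check a candidate string mechanically, counting parse trees is enough: a grammar is ambiguous as soon as some string has more than one. The following is a rough sketch, assuming the input is already split into tokens (i, e, x, y, 5, 13) and relying on the fact that this grammar has no empty productions; the function name count_trees is made up here.

```python
from functools import lru_cache

GRAMMAR = {
    "SL": [("S",), ("SL", "S")],
    "S": [("i", "B", "S", "e", "S"), ("i", "B", "S"), ("x",), ("y",)],
    "B": [("5",), ("13",)],
}

def count_trees(tokens, start="SL"):
    """Number of distinct parse trees deriving `tokens` from `start`."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def sym(symbol, i, j):
        # Trees for a single symbol covering tokens[i:j].
        if symbol not in GRAMMAR:                      # terminal
            return 1 if j == i + 1 and tokens[i] == symbol else 0
        return sum(seq(rhs, i, j) for rhs in GRAMMAR[symbol])

    @lru_cache(maxsize=None)
    def seq(rhs, i, j):
        # Trees for a sequence of symbols covering tokens[i:j].
        if not rhs:
            return 1 if i == j else 0
        head, rest = rhs[0], rhs[1:]
        last = j - len(rest)     # every symbol covers at least one token
        return sum(sym(head, i, k) * seq(rest, k, j)
                   for k in range(i + 1, last + 1))

    return sym(start, 0, len(tokens))

print(count_trees("i 5 i 5 y e y".split()))           # 2
print(count_trees("x i 13 y i 5 x e y x".split()))    # 1
```

The first call reports two trees for i5i5yey (the e can attach to either i), while the second reports a single tree for xi13yi5xeyx.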

Tips for creating "Context Free Grammar"

I am new to CFGs.
Can someone give me tips on creating a CFG that generates a given language?
For example:
L = {a^m b^n | m >= n}
What I got is:
S0 -> a | aS0 | aS1 | ε
S1 -> b | bS1 | ε
but I think this is wrong, because there is a chance that the number of b's can be greater than the number of a's.
How to write a CFG, using a^m b^n as an example
L = {a^m b^n | m >= n}.
Language description: a^m b^n consists of a's followed by b's, where the number of a's is equal to or greater than the number of b's.
Some example strings: {^, a, aa, aab, aabb, aaaab, ab, ...}
So there is always one a for each b, but extra a's are possible. In fact, a string can consist of a's only. Also notice that ^ (the null string) is a member of the language, because in ^ NumberOf(a) = NumberOf(b) = 0.
How do we write a grammar that accepts the language formed by strings a^m b^n?
In the grammar, there should be rules such that whenever you add a b symbol you also add an a symbol.
This can be done with something like:
S --> aSb
But this is incomplete, because we need a rule to generate extra a's:
A --> aA | a
Combine the two production rules into a single CFG:
S --> aSb | A
A --> aA | a
So you can generate strings that consist of a's only, as well as strings of a's followed by b's in the a^m b^n pattern.
But the above grammar has no way to generate the ^ string.
So, change the grammar like this:
S --> B | ^
B --> aBb | A
A --> aA | a
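This grammar generates ^ and every string a^m b^n with m > n. Note that it still cannot derive strings with m = n >= 1, such as ab or aabb, because A always contributes at least one a; the grammar given after the edit note below covers those as well.
Note: to generate the ^ null string, I added an extra first step to the grammar, S --> B | ^, so you can either produce ^ or a string of a's and b's. (B now plays the role of S from the previous grammar, generating matching a's and b's.)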
Edit: Thanks to #Andy Hayden
You can also write a simpler grammar for the language {a^m b^n | m >= n}:
S --> aSb | A
A --> aA | ^
Notice that here A --> aA | ^ can generate zero or more a's. This should be preferred to my grammar because it generates a smaller parse tree for the same string
(a smaller height is preferable because parsing is more efficient).
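A quick way to gain confidence in a small grammar like this is to write one recognizer function per nonterminal, mirroring the productions, and compare it with the language definition on all short strings. The sketch below does that for S --> aSb | A, A --> aA | ^, and includes the earlier variant (S --> B | ^, B --> aBb | A, A --> aA | a) for comparison; all function names are made up for illustration.

```python
from itertools import product

def derives_A(s):
    # A --> aA | ^   (zero or more a's)
    return s == "" or (s[0] == "a" and derives_A(s[1:]))

def derives_S(s):
    # S --> aSb
    if len(s) >= 2 and s[0] == "a" and s[-1] == "b" and derives_S(s[1:-1]):
        return True
    # S --> A
    return derives_A(s)

def derives_A_old(s):
    # A --> aA | a   (one or more a's)
    return s != "" and set(s) == {"a"}

def derives_B_old(s):
    # B --> aBb | A
    if len(s) >= 2 and s[0] == "a" and s[-1] == "b" and derives_B_old(s[1:-1]):
        return True
    return derives_A_old(s)

def derives_S_old(s):
    # S --> B | ^
    return s == "" or derives_B_old(s)

def in_language(s):
    # Definition: a^m b^n with m >= n.
    m, n = s.count("a"), s.count("b")
    return s == "a" * m + "b" * n and m >= n

words = ["".join(w) for k in range(7) for w in product("ab", repeat=k)]
print(all(derives_S(w) == in_language(w) for w in words))
print([w for w in words if derives_S(w) != derives_S_old(w)])
```

On strings up to length 6 the first grammar agrees with the definition everywhere, and the two grammars differ only on ab, aabb and aaabbb, i.e. on the non-empty strings with equally many a's and b's.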
The following tips may be helpful when writing a grammar for a formal language:
Be clear about what the language describes (its meaning/pattern).
Remember the solutions to some basic problems (the idea being that you can reuse them when writing new grammars).
Write rules for fundamental languages, as I have done for regular expressions in this example of writing a right-linear grammar. These rules will help you write grammars for new languages.
A different approach is to first draw an automaton and then convert the automaton to a grammar. There are standard techniques for writing a grammar from an automaton for every class of formal language.
Just as a good programmer learns by reading the code of others, one can learn to write grammars for formal languages by reading grammars written by others.
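For regular languages, the "automaton first, then grammar" tip is mechanical: every transition q --a--> p becomes a production q -> a p, and every accepting state gets an ε production. A small sketch under that assumption follows; the DFA (for a*b*, states P and Q) and the function name dfa_to_grammar are made up for illustration, and the method does not apply directly to non-regular languages such as a^m b^n with m >= n, which need a pushdown automaton instead.

```python
def dfa_to_grammar(delta, accepting):
    """Turn DFA transitions into right-linear productions.
    delta maps (state, symbol) -> state; states double as nonterminals."""
    rules = {}
    for (state, symbol), target in delta.items():
        rules.setdefault(state, []).append(symbol + " " + target)
    for state in accepting:
        rules.setdefault(state, []).append("ε")   # accepting: may stop here
    return rules

# Hypothetical DFA for a*b*: P is the start state, both states accept.
DELTA = {("P", "a"): "P", ("P", "b"): "Q", ("Q", "b"): "Q"}

for head, bodies in dfa_to_grammar(DELTA, ["P", "Q"]).items():
    print(head, "->", " | ".join(bodies))
```

For this automaton the result is P -> a P | b Q | ε and Q -> b Q | ε, with P as the start symbol.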
Also, the grammar you have written is wrong.
You want to create a grammar for the following language:
L = {a^n b^m | m >= n}
That means the number of b's must be greater than or equal to the number of a's,
or, put differently, for each b there can be at most one a, not the other way around.
Here is a grammar for this language:
S-> aSb | Sb | b | ab
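In this grammar each a comes with one b, but a b can be generated without generating any a. (As written it does not derive the empty string; add S -> ε if n = m = 0 should be included.)
You can also try these languages:
L1 = {a^n b^m | m > n}
L2 = {a^n b^m | m >= 2n}
L3 = {a^n b^m | 2m >= n}
L4 = {a^n b^m | m != n}
Here is a grammar for each language. As written, none of them derives the empty string, and the grammar for L2 also misses the strings with no a (b, bb, ...); add S -> ε where those strings are wanted.
For L1: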
S-> aSb | Sb | b
for L2
S-> aSbb | Sb | abb
for L3
S-> aaSb | Sb | aab | ab | b
for L4
S-> S1 | S2
S1-> aS1b | S1b | b
S2-> aS2b | aS2 | a
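Grammars like these can be checked by brute force: enumerate every string the grammar derives up to some length and compare with the definition. The sketch below does this for the L1 and L4 grammars using leftmost derivations; the function names are made up here, and the cap on sentential-form length is a heuristic that is generous enough for these grammars but is not a general guarantee.

```python
from collections import deque

def language(grammar, start, max_len):
    """All terminal strings of length <= max_len derivable from `start`.
    grammar maps each nonterminal to a list of right-hand sides (tuples);
    any symbol that is not a key of `grammar` is treated as a terminal."""
    cap = 2 * max_len + 2            # heuristic bound on form length
    seen = {(start,)}
    queue = deque(seen)
    words = set()
    while queue:
        form = queue.popleft()
        pos = next((i for i, s in enumerate(form) if s in grammar), None)
        if pos is None:              # no nonterminal left: a finished word
            words.add("".join(form))
            continue
        for rhs in grammar[form[pos]]:     # expand the leftmost nonterminal
            new = form[:pos] + rhs + form[pos + 1:]
            terminals = sum(s not in grammar for s in new)
            if terminals <= max_len and len(new) <= cap and new not in seen:
                seen.add(new)
                queue.append(new)
    return words

G1 = {"S": [("a", "S", "b"), ("S", "b"), ("b",)]}
G4 = {
    "S": [("S1",), ("S2",)],
    "S1": [("a", "S1", "b"), ("S1", "b"), ("b",)],
    "S2": [("a", "S2", "b"), ("a", "S2"), ("a",)],
}

def expected(pred, max_len):
    """Strings a^n b^m with n + m <= max_len that satisfy pred(n, m)."""
    return {"a" * n + "b" * m
            for n in range(max_len + 1)
            for m in range(max_len + 1 - n)
            if pred(n, m)}

print(language(G1, "S", 6) == expected(lambda n, m: m > n, 6))
print(language(G4, "S", 6) == expected(lambda n, m: m != n, 6))
```

Both comparisons hold up to length 6. The same helper can be pointed at the other grammars to see which short strings they do or do not derive.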
Fewest variables: S -> a S b | a S | ε
With fewer variables:
S -> a S b | a S | a b | e

First & Follow, Arithmetic Expressions

FIRST(A) = { b, epsilon }
FIRST(S) = { b, epsilon }
FOLLOW(S) = { a, $ }
FOLLOW(A) = { a, b, $ }
What are the arithmetic expressions for these FIRST and FOLLOW sets?
FIRST(X) = the terminals which can appear first when trying to parse the non-terminal X. If X can match the empty string, epsilon is also included.
FOLLOW(X) = the terminals which can appear immediately after the non-terminal X. This is the union of the FIRST sets of whatever appears after X in any production; if X can end a production, the FOLLOW set of that production's left-hand side is included as well, and $ (end of input) is in the FOLLOW set of the start symbol.
Read more: LL parser
The clues given are:
FIRST(A), FIRST(S) ⇒ All of the derivations of A and S respectively, must either begin with the terminal b, or be zero-length.
S → b ... | ε
A → b ... | ε
FOLLOW(S) ⇒ There must be some construction where S is followed by the terminal a, or a non-terminal which can begin with a. (Neither A nor S qualify).
S → b S a | ε
A → b ... | ε
FOLLOW(A) ⇒ There must be some construction where A is followed by each of the terminals a and b, or some non-terminal which can begin with those.
S → b S a | ε
A → b A b | b A a | ε
FOLLOW(A) ⇒ Assuming S is the start-symbol, A must appear at the end of some branch of S, possibly followed by other nullable non-terminals.
S → b S a | A | ε
A → b A b | b A a | ε
(NB. Adding A to S did not break the constraint on FIRST(S))
We can make the grammar a little smaller:
S → b S a | A | ε
A → b A b | ε
We can no longer generate strings like "bbbabb", but it does not violate the constraints.
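To double-check a reconstructed grammar, you can compute its FIRST and FOLLOW sets and compare them with the ones given. Below is a sketch of the usual fixed-point computation applied to the smaller grammar; the function name first_follow and the grammar encoding (empty tuple for ε, S as start symbol) are assumptions made for this example.

```python
EPS, END = "ε", "$"

def first_follow(grammar, start):
    """Compute FIRST and FOLLOW by iterating until nothing changes.
    grammar maps nonterminals to lists of right-hand sides (tuples)."""
    first = {nt: set() for nt in grammar}
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)

    def first_of(seq):
        # FIRST of a sequence of symbols, using the sets found so far.
        out = set()
        for sym in seq:
            if sym not in grammar:          # terminal
                out.add(sym)
                return out
            out |= first[sym] - {EPS}
            if EPS not in first[sym]:
                return out
        out.add(EPS)                        # the whole sequence can vanish
        return out

    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for rhs in bodies:
                found = first_of(rhs)
                if not found <= first[head]:
                    first[head] |= found
                    changed = True
                for i, sym in enumerate(rhs):
                    if sym not in grammar:
                        continue
                    tail = first_of(rhs[i + 1:])
                    add = tail - {EPS}
                    if EPS in tail:         # sym can end the production
                        add |= follow[head]
                    if not add <= follow[sym]:
                        follow[sym] |= add
                        changed = True
    return first, follow

# S → b S a | A | ε      A → b A b | ε
GRAMMAR = {
    "S": [("b", "S", "a"), ("A",), ()],
    "A": [("b", "A", "b"), ()],
}
first, follow = first_follow(GRAMMAR, "S")
print(first)
print(follow)
```

For this grammar the computed sets are FIRST(S) = FIRST(A) = {b, ε}, FOLLOW(S) = {a, $} and FOLLOW(A) = {a, b, $}, matching the sets in the question.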