Specifying language for a given grammar - grammar

DFA problem : Write a complete grammar for L, including the quadruple and the production rules
L ={x: ∃y ∈ {a, b}* : x = ay}
Answer:
G={{S, A}, {a, b}, S, P}
P: S => aA
A => aA | bA | λ
My question is :
Why is there a λ for A, but no λ for S?
From the language definition, L is any string that begins with an a and contains only a's and b's, so why does the answer contain A => bA? Doesn't A => bA mean that the string can start with b?
Thank you so much

1. Why is there a λ for A, but no λ for S?
λ (the null string) can be derived from A so that a sentential form can be turned into a sentence. Additionally, according to the language definition, the suffix y ∈ {a, b}* can be the empty string; e.g. "a" is a string that belongs to the language. If y contains any symbol, then the length of the string will be more than one.
S doesn't derive λ because the empty (null) string is not in the language. The shortest string in the language is the single symbol "a".
2. From the language definition, L is any string that begins with an a and contains only a's and b's, so why does the answer contain A => bA? Doesn't A => bA mean that the string can start with b?
Note that only strings that can be derived from the start variable S are included in the language of the grammar. You can't start a derivation from A (which is not the start variable). And if you start a derivation from S, your string will always start with the symbol a.
I suggest you read "Why the need for terminals? Is my solution sufficient enough?", where I wrote about the basic definition of a formal grammar.
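To see both points concretely, here is a small sketch that enumerates the strings derivable from a given variable, up to a length bound. The function name generate and the dictionary encoding of P (with "" standing for λ) are my own choices for illustration, not part of the original answer.

```python
# Productions of the grammar from the question; "" stands for λ.
PRODUCTIONS = {
    "S": ["aA"],
    "A": ["aA", "bA", ""],
}

def generate(start, max_len):
    """Return every terminal string of length <= max_len derivable from `start`."""
    results = set()
    seen = {start}
    frontier = [start]
    while frontier:
        form = frontier.pop()
        # Find the leftmost variable in the sentential form, if any.
        idx = next((i for i, ch in enumerate(form) if ch in PRODUCTIONS), None)
        if idx is None:
            results.add(form)  # no variables left: this is a sentence
            continue
        for rhs in PRODUCTIONS[form[idx]]:
            new_form = form[:idx] + rhs + form[idx + 1:]
            # Prune forms whose terminals alone already exceed the bound.
            terminals = sum(1 for ch in new_form if ch not in PRODUCTIONS)
            if terminals <= max_len and new_form not in seen:
                seen.add(new_form)
                frontier.append(new_form)
    return results

from_S = generate("S", 3)  # derivations that start from the start variable
from_A = generate("A", 3)  # what A alone could derive (NOT the language of G)
print(sorted(from_S, key=lambda w: (len(w), w)))
print(all(w.startswith("a") for w in from_S), "" in from_S, "b" in from_A)
```

Starting from A you can indeed get strings such as "b" or the empty string, but those derivations don't count: only derivations that start from S define the language.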


Definition of First and Follow sets of the right-hand sides of production

I am learning about LL(1) grammars. My task is to check whether a grammar is LL(1) and, if it is not, to find the rules that prevent it from being LL(1). I came across this link https://www.csd.uwo.ca/~mmorenom/CS447/Lectures/Syntax.html/node14.html which has a theorem that can be used as a criterion for deciding whether a grammar is LL(1). It says that for any rule A -> alpha | beta, some equalities involving FIRST and FOLLOW sets need to hold. Therefore, I need to find the FIRST and FOLLOW sets of these right-hand sides of productions.
Let's say I have the following rule: A -> a b B S | eps. How do I calculate FIRST and FOLLOW of a b B S? As far as I understand, by definition these sets are defined only for a single non-terminal symbol.
The idea behind the FIRST function is that it returns the set of terminals which could possibly start the expansion of its argument. It's usual to also add the special object ε (which is a way of writing an empty sequence of symbols) if ε is a possible expansion.
So if a is a terminal, FIRST(a) is just { a }. And if A is a non-terminal, FIRST(A) is the set of terminals which could possibly appear at the beginning of a derivation of A (plus ε if A can derive the empty sequence). Finally, FIRST(ε) must be { ε }, according to the convention described above.
Now suppose α is a (possibly empty) sequence of grammar symbols:
If α is empty (that is, it's ε), FIRST(α) is { ε }
If the first symbol in α is the terminal a, FIRST(α) is { a }.
If the first symbol in α is the non-terminal A, there are two possibilities. Let TAIL(α) be the rest of α after the first symbol. Now:
if ε ∈ FIRST(A), then FIRST(α) is (FIRST(A) − { ε }) ∪ FIRST(TAIL(α)), so that ε ends up in FIRST(α) only if the rest of α can also derive ε.
otherwise, FIRST(α) is FIRST(A).
Now, how do we compute FIRST(A), for every non-terminal A? Using the above definition of FIRST(α), we recursively define FIRST(A) to be the union of the sets FIRST(α) for every α which is the right-hand side of a production A → α.
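The recursive definition above is usually computed as a fixed point: start with empty sets and keep applying the rules until nothing changes. Below is a minimal sketch of that idea. The question does not give productions for B and S, so the grammar in the example is a made-up one (my assumption) that merely contains the rule A -> a b B S | ε; the function names are also mine.

```python
EPS = "ε"

def first_of_sequence(seq, first):
    """FIRST of a sequence of grammar symbols, given the FIRST sets of the non-terminals."""
    result = set()
    for sym in seq:
        # A symbol without an entry in `first` is a terminal: FIRST(a) = {a}.
        sym_first = first[sym] if sym in first else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result  # this symbol cannot vanish, so stop here
    result.add(EPS)  # the sequence is empty, or every symbol in it can derive ε
    return result

def compute_first(grammar):
    """Iterate until no FIRST set grows any more."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of_sequence(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

# Hypothetical grammar for illustration; [] is the empty right-hand side (ε).
grammar = {
    "S": [["A", "c"]],
    "A": [["a", "b", "B", "S"], []],
    "B": [["b"], []],
}
first = compute_first(grammar)
print(first)
print(first_of_sequence(["a", "b", "B", "S"], first))  # FIRST of the right-hand side a b B S
```

Note that first_of_sequence is exactly what you apply to a whole right-hand side such as a b B S, while compute_first only fills in the sets for single non-terminals.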
The FOLLOW function defines the set of terminal symbols which might appear after the expansion of a non-terminal. It is only defined on non-terminals; if you look carefully at the LL(1) conditions on the page you cite, you'll see that FIRST is applied to a right-hand side, while FOLLOW is only applied to left-hand sides.

Formal Languages - Grammar

I am taking a Formal Languages and Computability class and am having a little trouble understanding the concept of grammar. One of my assignment questions is this:
Take ∑ = {a,b}, and let n_a(w) and n_b(w) denote the number of a's and b's in the string w, respectively. Then the grammar G with productions:
S -> SS
S -> λ
S -> aSb
S -> bSa
generates the language L = {w: n_a(w) = n_b(w)}.
1) The language in the example contains the empty string. Modify the given grammar so that it generates L - {λ}.
I am thinking that I should modify the condition of L, something like:
L = {w: n_a(w) = n_b(w), n_a(w), n_b(w) > 0}
That way, we indicate that the string is never empty.
2) Modify the grammar in the example so that it will generate L ∪ {a^n b^(n+1): n >= 0}.
I am not sure how to do this one. Does that mean I should add one more production to the grammar, something like S -> aSbb?
Any explanation about these two questions would be greatly appreciated. I'm still trying to figure this grammar stuff out, so I am not sure about my answers.
1) The question is about modifying the grammar to obtain a new language, so don't modify the language directly…
Your grammar generates the empty word because of the production:
S -> λ
So you could think of removing this production altogether. This yields the following grammar:
S -> SS
S -> aSb
S -> bSa
Unfortunately, this grammar doesn't generate any strings at all, so its language is empty (a bit like in induction, it is missing a base case: there are no productions that consist only of terminals). To fix this, add the following productions:
S -> ab
S -> ba
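If you want to convince yourself that the resulting grammar (S -> SS, S -> aSb, S -> bSa, S -> ab, S -> ba) really generates L - {λ}, a brute-force check up to some length is easy to write. This is only a sketch with names of my own choosing; it relies on the fact that none of these productions shrinks a sentential form, so forms longer than the bound can be discarded.

```python
from itertools import product

RULES = ["SS", "aSb", "bSa", "ab", "ba"]  # right-hand sides for S

def generated(max_len):
    """All terminal strings of length <= max_len derivable from S."""
    seen, frontier, words = {"S"}, ["S"], set()
    while frontier:
        form = frontier.pop()
        i = form.find("S")  # expand the leftmost S
        if i == -1:
            words.add(form)
            continue
        for rhs in RULES:
            new_form = form[:i] + rhs + form[i + 1:]
            # No rule shortens a form, so anything too long can be dropped.
            if len(new_form) <= max_len and new_form not in seen:
                seen.add(new_form)
                frontier.append(new_form)
    return words

def expected(max_len):
    """All non-empty strings over {a, b} with as many a's as b's."""
    return {
        "".join(t)
        for n in range(1, max_len + 1)
        for t in product("ab", repeat=n)
        if t.count("a") == t.count("b")
    }

print(generated(6) == expected(6))
```

A bounded check like this is not a proof, of course, but it catches mistakes such as forgetting the base productions.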
2) Don't randomly try to add production rules in the hope that it's going to work. Here you want a's followed by b's. So the production rule
S -> bSa
must certainly disappear. Also, the rule
S -> SS
would produce, e.g., abab (try to see how this is obtained). So we'll have to remove it too. We're left with:
S -> λ
S -> aSb
Now this grammar generates:
λ
ab
aabb
aaabbb
etc. That's not bad at all! To get an extra trailing b, we could create a new non-terminal, say T, replace our current S by T, and add that trailing b in S:
T -> λ
T -> aTb
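S -> Tb
Note that this grammar generates only {a^n b^(n+1): n >= 0}. Since the question asks for the union with L, the original productions are still needed for the L part; one way is to keep them under their own non-terminal, say E, and let S choose between the two parts:
S -> E
E -> EE
E -> λ
E -> aEb
E -> bEa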
I know that this is homework; I gave you the solutions to your homework: that's because, from the way you asked your question, it seems you're completely lost. I hope this answer will help you get on the right path!

Computation of follow set

To compute FOLLOW(A) for all non-terminals A, apply the following rules
until nothing can be added to any FOLLOW set.
Place $ in FOLLOW(S), where S is the start symbol and $ is the input
right endmarker.
If there is a production A -> aBb, then everything in FIRST(b) except epsilon
is in FOLLOW(B).
If there is a production A -> aB, or a production A -> aBb where
FIRST(b) contains epsilon, then everything in FOLLOW(A) is in FOLLOW(B).
Here a and b actually stand for alpha and beta (strings of grammar symbols). This is from the Dragon book.
Now my question is: in this case, can we take a = epsilon?
And can b (beta) be two non-terminals, like XY? (If it is a sentential form, then it should be able to.)
Here's what the Dragon book actually says: [See note 1]
Place $ in FOLLOW(S).
For every production A→αBβ, place everything
in FIRST(β) except ε into
FOLLOW(B)
For every production A→αB or
A→αBβ where FIRST(β) contains
ε, place FOLLOW(A) into
FOLLOW(B).
There is a section earlier in the book on "notational conventions" in which it is made clear that a lower-case greek letter like α or β represents a possibly empty string of grammar symbols. So, yes, α could be empty and β could be two nonterminals (or any other string of grammar symbols).
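To make this concrete, here is a rough sketch of the three rules as a fixed-point computation. The grammar at the bottom is a made-up example (my assumption, not from the book) chosen so that S → BXY has an empty α and a β made of two non-terminals; the function names are mine as well.

```python
EPS, END = "ε", "$"

def first_of(seq, first):
    """FIRST of a (possibly empty) string of grammar symbols."""
    out = set()
    for sym in seq:
        f = first.get(sym, {sym})  # a terminal's FIRST set is itself
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)
    return out

def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

def compute_follow(grammar, start):
    first = compute_first(grammar)
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)  # rule 1
    changed = True
    while changed:
        changed = False
        for a, alternatives in grammar.items():
            for rhs in alternatives:
                for i, b in enumerate(rhs):
                    if b not in grammar:
                        continue  # FOLLOW is only defined for non-terminals
                    beta_first = first_of(rhs[i + 1:], first)
                    new = beta_first - {EPS}  # rule 2
                    if EPS in beta_first:     # rule 3: β is empty or can derive ε
                        new |= follow[a]
                    if not new <= follow[b]:
                        follow[b] |= new
                        changed = True
    return follow

# Hypothetical grammar: in S -> B X Y, α is empty and β = X Y.
grammar = {
    "S": [["B", "X", "Y"]],
    "B": [["b"]],
    "X": [["x"], []],
    "Y": [["y"], []],
}
print(compute_follow(grammar, "S"))
```

Here FIRST(XY) contains ε, so both the second and the third rule contribute to FOLLOW(B).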
Note:
Here I'm using a variant on the formatting suggestion made by #leftroundabout in this meta post. (The only difference is that I put the formulae in bold.) It's easy to type Greek letters as entities if you don't have a Greek keyboard handy; just use, for example, &alpha; (α) or &beta; (β). For upper-case Greek letters, write the name with an upper-case letter: &Sigma; (Σ). Other useful symbols are arrows: &rarr; (→) and &rArr; (⇒).

is this regular grammar- S -> 0S0/00?

Let L denote the language generated by the grammar S -> 0S0/00. Which of the following is true?
(A) L = 0+
(B) L is regular but not 0+
(C) L is context free but not regular
(D) L is not context free
Hi, can anyone explain to me how the language generated by the grammar S -> 0S0/00 is regular? I know very well that the grammar is context-free, but I'm not sure how the language can be regular.
If you mean the language generated by the grammar
S -> 0S0
S -> 00
then it should be clear that it is the same language as is generated by
S -> 00S
S -> 00
which is a right regular grammar, and consequently generates a regular language. (Some people would say that a right regular grammar can only have a single terminal in each production, but it is trivial to create a chain of aN productions to produce the same effect.)
It should also be clear that the above differs from
S -> 0S
S -> 0
which generates 0+.
We know that a language is regular if there exists a DFA (deterministic finite automaton) that recognizes it, or an RE (regular expression) that describes it. Either way, we can see here that your grammar generates words like 00, 0000, 000000, 00000000, etc., so they are words that start and end with '0' and contain an even number of zeroes, at least two.
Here's a DFA for this grammar
Also here is an RE (regular expression) that describes the language:
(0)(00)*(0)
Therefore you know the language generated by this grammar is regular.
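In case the picture of the DFA doesn't show up, here is one possible DFA written out as code. The state names q0, q1, q2 and the table layout are my own; it is a sketch of a machine for "an even number of zeroes, at least two", not necessarily the exact machine in the picture.

```python
# q0: nothing read yet, q1: odd number of 0s read, q2: even number (>= 2) of 0s read.
TRANSITIONS = {
    ("q0", "0"): "q1",
    ("q1", "0"): "q2",
    ("q2", "0"): "q1",
}
START, ACCEPTING = "q0", {"q2"}

def accepts(word):
    """Run the DFA on `word`; a missing transition acts as a dead state."""
    state = START
    for ch in word:
        state = TRANSITIONS.get((state, ch))
        if state is None:
            return False
    return state in ACCEPTING

for w in ["", "0", "00", "000", "0000", "01"]:
    print(repr(w), accepts(w))
```

Only three states are needed because the machine just has to remember whether the number of zeroes read so far is odd or even, and whether it has read anything at all.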
(Sorry if the terms aren't 100% accurate; I took this class in French, so terms might differ a bit.) Let me know if you have any other questions!
Consider first the definition of a regular grammar here
https://www.cs.montana.edu/ross/theory/contents/chapter02/green/section05/page04.xhtml
So first we need a set N of non-terminal symbols (symbols that can be rewritten as a combination of terminal and non-terminal symbols). The grammar as given uses N={S}.
Next we need a set T of terminal symbols (symbols that cannot be replaced); for our example T={0}.
Now we need a set P of grammar rules that fit a very specific form (see link): each non-terminal can be replaced with a terminal, a terminal followed by a non-terminal, or the empty string. As written, P={S->0S0,S->00} does not fit this form, because S->0S0 has a terminal on both sides of the non-terminal and S->00 has two terminals. However, the same language is generated by the rules S->0A, A->0S, A->0, which do fit the form once we add a second non-terminal A, so N={S,A}.
Now we just need a starting symbol X; we can trivially say that our starting symbol is S.
Therefore the tuple (N={S,A},T={0},P={S->0A,A->0S,A->0},X=S) fits the requirements of a regular grammar, and since it generates the same language as the original grammar, L is regular.
We don't need the machinery of regular grammars to answer your question. Just note the possible derivations all look like this:
S -> (0 S 0) -> 0 (0 S 0) 0 -> 0 0 (0 S 0) 0 0 -> ... -> 0...0 (0 0) 0...0
                                                         \_ _/       \_ _/
                                                           k           k
Here I've added parens ( ) to show the result of the previous expansion of S. These aren't part of the derived string. I.e. we substitute S with 0 S 0 k >= 0 times followed by a single substitution with 00.
From this it should be easy to see that L is the set of strings of 0's of length 2k + 2 for some integer k >= 0. A shorthand notation for this is
L = { 0^(2m) | m >= 1 }
In words: The set of all even length strings of zeros excluding the empty string.
To prove L is regular, all we need is a regular expression for L. This is easy: (00)+. Or if you prefer, 00(00)*.
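If you'd like to sanity-check the regular expression against the grammar, here is a small sketch. The helper derive is my own name; it builds the string you get after k substitutions of 0S0 followed by the final 00, as in the derivation above, and compares the result with Python's re module.

```python
import re

PATTERN = re.compile(r"(00)+")

def derive(k):
    """String obtained by applying S -> 0S0 k times, then S -> 00."""
    return "0" * k + "00" + "0" * k

# Every string the grammar derives is matched by the regular expression ...
print(all(PATTERN.fullmatch(derive(k)) for k in range(50)))

# ... and every string of zeros matched by it (up to a bound) is derivable.
derivable = {derive(k) for k in range(50)}
matched = {"0" * n for n in range(0, 101) if PATTERN.fullmatch("0" * n)}
print(matched == derivable)
```

Both directions are needed: the first check alone would also pass for the too-generous expression 0+.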
You might be confused because a small change to the grammar makes its language context free but not regular:
S -> 0S1/01
This is the more complex language { 0^m 1^m | m >= 1 }. It's straightforward to show this isn't a regular language using the Pumping Lemma.

Defining a language in EBNF

Give the EBNF specification for the language L that is made up of the chars a, b and c such that sentences in the language have the form
L : s q s^R
-s is a string of any combination of the characters a and b
-s^R is that same string s reversed
-q is an odd number of c's followed by either an odd number of b's
or an even number of a’s.
What I have so far:
L -> S
S -> {a}{b}Q
Q ->
If this is right, I'm still not really sure how to produce from Q and also how to represent S in reverse.
This is a string that starts and ends with the same string, but reversed:
X -> aXa
-> bXb
This is a string with an odd number of c's:
Y -> cY2
Y2 -> ccY2
I've left out some crucial bits, but hopefully this can get you started.
Try building the first two parts from the middle out.
You can force an odd number of repetitions by starting with exactly one item and adding N*2 additional items (for integer N). This should suggest how to force an even number as well.
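As a rough illustration of the "one item plus N*2 more" idea, here is a tiny recognizer for an odd number of c's only (not the whole language). It follows rules of the shape Y -> c Y2, Y2 -> cc Y2, plus an empty alternative for Y2 as the stopping case, which is my assumption since that bit was left out above. In EBNF this could be written as Y = "c" , { "c" , "c" } ; the function names are made up.

```python
def parse_y2(s, i):
    """Y2 -> "cc" Y2 | (empty). Returns the index after the longest match."""
    if s.startswith("cc", i):
        return parse_y2(s, i + 2)
    return i  # empty alternative: consume nothing

def parse_y(s, i=0):
    """Y -> "c" Y2. Returns the index after the match, or None on failure."""
    if not s.startswith("c", i):
        return None
    return parse_y2(s, i + 1)

def is_odd_cs(s):
    """True if the whole string is an odd number of c's."""
    return parse_y(s) == len(s)

for w in ["", "c", "cc", "ccc", "cccc", "ccccc", "cb"]:
    print(repr(w), is_odd_cs(w))
```

For an even count you would drop the single leading item and keep only the pairs; whether zero should count as even is something to decide from the assignment.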