I got this interview question that I know is easy, but I can't seem to get it.
State the following grammar formally as a 4-tuple. (Assume the terminal alphabet is the set of lowercase letters that appear in productions, the nonterminal alphabet is the set of uppercase letters that appear in productions, and the start symbol is S.)
S -> abS|X
X -> baX|epsilon
A grammar G is defined as a four-tuple (N, S, E, P) where:
N is a finite set of nonterminal symbols
S, an element of N, is the start symbol
E is a finite set of terminal symbols
N and E are disjoint
P is a finite set of productions, i.e. ordered pairs in (N + E)* x (N + E)*
There is a derivation of string w in grammar G if there is a sequence w[1], w[2], ..., w[n] of strings over (N + E)* such that:
w[1] = S
For each 1 <= i < n: w[i] = xyz, w[i+1] = xy'z and (y, y') is a production in P
w[n] = w is a string of terminal symbols.
The set of all strings w which have a derivation in G is called the language of G, L(G).
Now, for your grammar:
Nonterminal symbols: S, X
Start symbol: S
Terminal symbols: a, b
Productions: (S, abS), (S, X), (X, baX), (X, epsilon)
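To make the definition concrete, here is a minimal sketch that stores the tuple as plain Python values and searches derivations breadth-first. The name `language_up_to` and the use of the empty string for epsilon are my own choices, not part of the question. The pruning by terminal count is enough to make the search finite for this grammar, because every sentential form here holds at most one nonterminal; it is not a general-purpose parser.

```python
from collections import deque

# The grammar from the question as a 4-tuple (N, S, E, P).
# The empty Python string stands in for epsilon (an assumption of this sketch).
N = {"S", "X"}
S = "S"
E = {"a", "b"}
P = [("S", "abS"), ("S", "X"), ("X", "baX"), ("X", "")]


def language_up_to(max_len):
    """Collect every terminal string of length <= max_len derivable from S."""
    seen = {S}
    queue = deque([S])
    words = set()
    while queue:
        form = queue.popleft()
        if all(ch in E for ch in form):
            # Only terminals left: this is w[n] in the definition above.
            words.add(form)
            continue
        for lhs, rhs in P:
            start = form.find(lhs)
            while start != -1:
                # One derivation step: w[i] = x lhs z  =>  w[i+1] = x rhs z.
                nxt = form[:start] + rhs + form[start + len(lhs):]
                terminals = sum(ch in E for ch in nxt)
                if terminals <= max_len and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
                start = form.find(lhs, start + 1)
    return words


print(sorted(language_up_to(4), key=lambda w: (len(w), w)))
# ['', 'ab', 'ba', 'abab', 'abba', 'baba']
```

The output matches what the productions suggest: some number of ab blocks followed by some number of ba blocks.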
I have a question related to this one:
L = the empty language, where the alphabet is {a,b}
How do I create a grammar for this? What would the production rules be?
Thanks in advance.
A grammar G is an ordered 4-tuple (S, N, E, P) where:
N is a set of non-terminal symbols
E is a set of terminal symbols
N and E are disjoint
E is a superset of the alphabet of L(G)
e is the empty string
P is a set of ordered pairs of strings over (N U E U e); that is, P is a subset of (N U E U e)* X (N U E U e)*.
S, the start symbol, is in N
A derivation in G is a sequence of elements of (N U E U e)* such that:
The first element is S
Adjacent elements w[i] and w[i+1] can be written as w[i] = uxv and w[i+1] = uyv such that (x, y) is in P
If there is a derivation in G whose last element is a string w[n] over (E U e)*, we say G generates w[n]; that is, w[n] is in L(G).
Now, we want to define a grammar G such that L(G) is the empty set. We fix the alphabet E = {a, b}. We must still define:
N, the set of nonterminals
S, the start symbol
P, the productions
We might as well take S as our start symbol. So N contains at least S; N is a superset of {S}. We will only add more nonterminals if we determine we need them. Let us turn our attention to the condition that L(G) is empty.
If L(G) is empty, that means there is no derivation in G that leads to a string of only terminal symbols. We can accomplish this easily by ensuring that the right-hand side of every production contains at least one nonterminal, with or without terminals alongside it. Then every string in a derivation still contains a nonterminal, so no derivation ever reaches a string of terminals. So the following grammars would all work:
S := S
or
S := aSb
or
S := aXb | XXSSX
X := aabbXbbaaS
etc. All of these grammars have L(G) empty since none of them can derive a string consisting only of terminals.
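If you want to check this mechanically, the usual approach for context-free grammars is to compute the set of "productive" nonterminals (those that can derive some terminal string) and see whether S is among them. Below is a rough sketch of that idea; the function name `generates_nothing` and the dictionary encoding (uppercase letters are nonterminals, lowercase letters are terminals) are assumptions made for this example.

```python
def generates_nothing(productions, start):
    """Return True if no terminal string is derivable from `start`.

    `productions` maps each nonterminal (an uppercase letter) to a list of
    right-hand sides written as strings; lowercase letters are terminals.
    """
    productive = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head in productive:
                continue
            for body in bodies:
                # A body is productive when every symbol in it is a terminal
                # or a nonterminal already known to be productive.
                if all(sym.islower() or sym in productive for sym in body):
                    productive.add(head)
                    changed = True
                    break
    return start not in productive


g1 = {"S": ["S"]}
g2 = {"S": ["aSb"]}
g3 = {"S": ["aXb", "XXSSX"], "X": ["aabbXbbaaS"]}
g4 = {"S": ["aSb", "ab"]}  # for contrast: this one does generate strings

for g in (g1, g2, g3, g4):
    print(generates_nothing(g, "S"))
```

The three grammars above report an empty language, while the contrast grammar `g4` does not, because its second production has no nonterminal on the right-hand side.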
I have difficulty constructing a grammar for a language, especially a linear grammar.
Can anyone please give me some basic tips or a methodology for constructing the grammar for any language? Thanks in advance.
I am unsure whether the answer to this question is right: "Construct a linear grammar for the language:"
L ={a^n b c^n | n belongs to Natural numbers}
Solution:
Right-Linear Grammar :
S--> aS | bA
A--> cA | ^
Left-Linear Grammar:
S--> Sc | Ab
A--> Aa | ^
As pointed out in the comments, these grammars are wrong since they generate strings not in the language. Here's a derivation of abcc in both grammars:
S -> aS -> abA -> abcA -> abccA -> abcc
S -> Sc -> Scc -> Abcc -> Aabcc -> abcc
Also as pointed out in the comments, there is a simple linear grammar for this language, where a linear grammar is defined as having at most one nonterminal symbol in the RHS of any production:
S -> aSc | b
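A quick way to see the difference is to test a few strings against both descriptions. This is only a sketch: `in_target` is a hypothetical helper that mirrors S -> aSc | b directly (so it allows n = 0), and the regular expression a*bc* is my reading of what the right-linear grammar S -> aS | bA, A -> cA | ^ generates.

```python
import re


def in_target(w):
    """Membership in {a^n b c^n | n >= 0}, mirroring S -> aSc | b."""
    if w == "b":
        return True
    # Peel one matching a ... c pair and recurse, as S -> aSc does.
    return len(w) >= 3 and w[0] == "a" and w[-1] == "c" and in_target(w[1:-1])


# What the proposed right-linear grammar generates: any a's, one b, any c's.
wrong = re.compile(r"a*bc*")

for w in ["b", "abc", "aabcc", "abcc", "aab"]:
    print(w, in_target(w), bool(wrong.fullmatch(w)))
```

The strings abcc and aab are accepted by the regular expression but rejected by `in_target`, which is exactly the over-generation shown in the derivations above.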
There are some general rules for constructing grammars for languages. These are either obvious simple rules or rules derived from closure properties and the way grammars work. For instance:
1. if L = {a} for an alphabet symbol a, then S -> a is a grammar for L.
2. if L = {e} for the empty string e, then S -> e is a grammar for L.
3. if L = R U T for languages R and T, then S -> S' | S'' along with the grammars for R and T is a grammar for L if S' is the start symbol of the grammar for R and S'' is the start symbol of the grammar for T.
4. if L = RT for languages R and T, then S -> S'S'' along with the grammars for R and T is a grammar for L if S' is the start symbol of the grammar for R and S'' is the start symbol of the grammar for T.
5. if L = R* for language R, then S -> S'S | e along with the grammar for R is a grammar for L if S' is the start symbol of the grammar for R.
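The union, concatenation and star rules in the list above can be sketched in a few lines. Everything here is illustrative: grammars are dictionaries from a nonterminal to a list of right-hand sides (lists of symbols), `rename` suffixes nonterminals so the two grammars cannot clash, and `words` is a small bounded generator I use only to check the results. Its length cap is a heuristic that is generous enough for these examples, not a general algorithm.

```python
def rename(grammar, tag):
    """Suffix every nonterminal with `tag` so two grammars cannot clash."""
    return {
        head + tag: [[s + tag if s in grammar else s for s in body] for body in bodies]
        for head, bodies in grammar.items()
    }


def union(r, r_start, t, t_start):
    g = {**rename(r, "1"), **rename(t, "2")}
    g["S"] = [[r_start + "1"], [t_start + "2"]]   # S -> S' | S''
    return g


def concat(r, r_start, t, t_start):
    g = {**rename(r, "1"), **rename(t, "2")}
    g["S"] = [[r_start + "1", t_start + "2"]]     # S -> S'S''
    return g


def star(r, r_start):
    g = rename(r, "1")
    g["S"] = [[r_start + "1", "S"], []]           # S -> S'S | e
    return g


def words(grammar, start, max_len):
    """Terminal strings of length <= max_len, found by leftmost expansion."""
    found, seen, stack = set(), set(), [(start,)]
    while stack:
        form = stack.pop()
        idx = next((i for i, s in enumerate(form) if s in grammar), None)
        if idx is None:
            found.add("".join(form))
            continue
        for body in grammar[form[idx]]:
            nxt = form[:idx] + tuple(body) + form[idx + 1:]
            terminals = sum(s not in grammar for s in nxt)
            if terminals <= max_len and len(nxt) <= 2 * max_len + 2 and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return found


R = {"R": [["a", "R", "b"], ["a", "b"]]}
T = {"T": [["c", "T", "d"], ["c", "d"]]}
print(sorted(words(concat(R, "R", T, "T"), "S", 6)))
print(sorted(words(star(R, "R"), "S", 4)))
```

Note that the new production in `concat` and in `star` has two nonterminals on its right-hand side, so the result is not linear even when both inputs are.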
Rules 4 and 5, as written, do not preserve linearity. Linearity can be preserved for left-linear and right-linear grammars (since those grammars describe regular languages, and regular languages are closed under these kinds of operations); but linearity cannot be preserved in general. To prove this, an example suffices:
R -> aRb | ab
T -> cTd | cd
L = RT = a^n b^n c^m d^m, 0 < n, m
L' = R* = (a^n b^n)*, 0 < n
Suppose there were a linear grammar for L. We must have a production for the start symbol S that produces something. To produce something, we require a string of terminal and nonterminal symbols. To be linear, we must have at most one nonterminal symbol. That is, our production must be of the form
S := xYz
where x is a string of terminals, Y is a single nonterminal, and z is a string of terminals. If x is non-empty, reflection shows the only useful choice is a; anything else fails to derive known strings in the language. Similarly, if z is non-empty, the only useful choice is d. This gives four cases:
1. x empty, z empty. This is useless, since we now have the same problem to solve for nonterminal Y as we had for S.
2. x = a, z empty. Y must now generate exactly a^n' b^n' b c^m d^m where n' = n - 1. But then the exact same argument applies to the grammar whose start symbol is Y.
3. x empty, z = d. Y must now generate exactly a^n b^n c c^m' d^m' where m' = m - 1. But then the exact same argument applies to the grammar whose start symbol is Y.
4. x = a, z = d. Y must now generate exactly a^n' b^n' bc c^m' d^m' where n' and m' are as in 2 and 3. But then the exact same argument applies to the grammar whose start symbol is Y.
None of the possible choices for a useful production for S is actually useful in getting us closer to a string in the language. Therefore, no strings are derived, a contradiction, meaning that the grammar for L cannot be linear.
Suppose there were a linear grammar for L'. Then that grammar has to generate all the strings in (a^n b^n)R(a^m b^m), plus those in e + R. But it can't generate the ones in the former by the argument used above: any production useful for that purpose would get us no closer to a string in the language.
I have got this grammar:
G = (N, Epsilon, P, S)
with
N = {S, A, B}
Epsilon = {a},
P: S -> e
S -> ABA
AB -> aa
aA -> aaaA
A -> a
Why is this a grammar of only type 0?
I think it is because of aA -> aaaA, but I don't see how it is in conflict with the rules.
The rules have to be built like this:
x1 A x2 -> x1 B x2 while:
A is element of N;
x1,x2 are elements of V*;
and B is element of VV*;
With V = N U Epsilon, I don't see the problem here.
a is from V, and A is from N, while right of A there could be the empty word, which would also be part of V*, so the left side would be okay.
On the right side, there is x1 again, being a, then we could say aaA is part of VV*, with aa being V and A being V*, while the right part is x2, so empty again.
"The rules have to be built like this:
x1 A x2 -> x1 B x2 while:...."
Yes, that is correct. However, there is an equivalent definition of the rules of type-1 grammars:
p -> q where
p and q are elements of V^+, length(p) <= length(q), and, naturally, p contains an element of N.
Every rule in your grammar satisfies this form, apart from S -> e, which is the usual permitted exception because S never appears on a right-hand side. So your grammar is type-1.
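The length condition is easy to check rule by rule. Here is a small sketch under the assumptions above: rules are (p, q) string pairs, the empty string stands for e, and S -> e is tolerated only when S does not occur on any right-hand side. The function name `is_noncontracting` is made up for this example.

```python
NONTERMINALS = {"S", "A", "B"}
RULES = [("S", ""), ("S", "ABA"), ("AB", "aa"), ("aA", "aaaA"), ("A", "a")]


def is_noncontracting(rules, nonterminals, start="S"):
    """Check length(p) <= length(q) for every rule p -> q, allowing S -> e."""
    start_on_rhs = any(start in q for _, q in rules)
    for p, q in rules:
        # The left-hand side must contain at least one nonterminal.
        if not any(sym in nonterminals for sym in p):
            return False
        if p == start and q == "":
            # S -> e is only allowed when S never appears on a right-hand side.
            if start_on_rhs:
                return False
            continue
        if len(p) > len(q):
            return False
    return True


print(is_noncontracting(RULES, NONTERMINALS))  # True
```

A rule such as AB -> a would fail the check, since its right-hand side is shorter than its left-hand side.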
Using pumping lemma, we can easily prove that the language L1 = {WcW^R|W ∈ {a,b}*} is not a regular language. (the alphabet is {a,b,c}; W^R represents the reverse string W)
However, if we replace the character c with "x" (x ∈ {a,b}^+), say, L2 = {WxW^R | x, W ∈ {a,b}^+}, then L2 is a regular language.
Could you give me some ideas?
If we replace character c with x where (x ∈ {a,b}+), say, L2 = {WXW^R | X, W ∈ {a,b}+}, then L2 is a regular language.
Yes, L2 is a regular language :).
You can write a regular expression for L2 too.
Language L2 = {WXW^R | X, W ∈ {a,b}+} means:
the string should start with any string consisting of a and b, that is W, and end with the reverse string W^R.
Notice: because W and W^R are the reverse of each other, the string starts and ends with the same symbol (which can be either a or b),
and it contains any string of a and b in the middle, that is X. (Because of the +, the length of X is at least one: |X| >= 1.)
Example of this kind of strings can be following:
aabababa, as follows:
 a    ababab    a
---  --------  ---
 W      X      W^R
or it can be also:
babababb, as follows:
 b    ababab    b
---  --------  ---
 W      X      W^R
Notice that the length of W is not constrained by the language definition,
so any string WXW^R can be assumed to be of the form a(a + b)+a or b(a + b)+b
 a   (a + b)+   a
---  --------  ---
 W      X      W^R
or
 b   (a + b)+   b
---  --------  ---
 W      X      W^R
And the regular expression for this language is: a(a + b)+a + b(a + b)+b
Don't confuse WXW^R with WcW^R; it is X with + that makes the language regular. Because X can absorb any string in (a + b)+, we only need finitely many choices for W, namely a and b (a finite choice is regular).
The language WXW^R can be described as: if a string starts with a it ends with a, and if it starts with b it ends with b. So correspondingly we need two final states:
Q6 if W is a
Q5 if W is b
Its DFA is as given below.
Any string in the language with |W| > 1 can be interpreted as a string in the language where |W| = 1. Thus, a string is in the language if it begins and ends with the same symbol and has at least one symbol in between. There are two symbols: a and b. So that language is equivalent to the language a(a+b)(a+b)*a + b(a+b)(a+b)*b. To prove this, you should formalize the argument that "if y is in WxW^R, then y is in a(a+b)(a+b)*a + b(a+b)(a+b)*b; and if y is in a(a+b)(a+b)*a + b(a+b)(a+b)*b, then y is in WxW^R".
It doesn't work in the other case since c is a fixed symbol, and can't include all but the characters on the ends. As soon as you bound the length of "x" in your example, the language becomes non-regular.
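If the equivalence still feels suspicious, a brute-force comparison over all short strings is a cheap sanity check (not a proof). In this sketch `in_l2` tries every split of a string into W, x, W^R straight from the definition, and the regular expression is the one from the answers above; both names are mine.

```python
import re
from itertools import product


def in_l2(s):
    """Direct definition: s = W x W^R with W and x non-empty over {a, b}."""
    return any(
        s[:k] == s[-k:][::-1]                     # prefix W equals reversed suffix
        for k in range(1, (len(s) - 1) // 2 + 1)  # leave at least one symbol for x
    )


# a(a+b)+a + b(a+b)+b written in Python's regex syntax.
same_ends = re.compile(r"a[ab]+a|b[ab]+b")

agree = all(
    in_l2("".join(chars)) == bool(same_ends.fullmatch("".join(chars)))
    for n in range(0, 9)
    for chars in product("ab", repeat=n)
)
print(agree)  # True
```

The two descriptions agree on every string over {a, b} of length up to 8, including a^n b a^n, which `in_l2` accepts with W = a.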
The question says W ∈ {a,b}^+, so a^n(a+b)a^n should be in the language L2. Now there is no DFA that will accept the strings a^n(a+b)a^n because, after reading n a's and (a+b)^+, there is no way for the DFA to remember exactly how many a's it read at the beginning, so L2 should not be regular. But everywhere I search for this answer, it says it is regular. This bugs me.
Suppose I have a substitution S and list Xs, where each variable occurring in Xs also occurs in S. How would I find the list S(Xs), i.e., the list obtained by applying the substitution S to the list Xs?
More concretely, I have a set of predicates and DCG rules that look something like
pat(P) --> seg(_), P, seg(_).
seg(X,Y,Z) :- append(X,Z,Y).
If I attempt to match a pattern P with variables against a list, I receive a substitution S:
?- pat([a,X,b,Y],[d,a,c,b,e,d],[]).
X = c,
Y = e
I want to apply the substitution S = {X = c, Y = e} to a list Xs with variables X and Y, and receive the list with substitutions made, but I'm not sure what the best way to approach the problem is.
If I were approaching this problem in Haskell, I would build a finite map from variables to values, then perform the substitution. The equivalent approach would be to produce a list in the DCG rule of pairs of variables and values, then use the map to find the desired list. This is not a suitable approach, however.
Since the substitution is not reified (is not a Prolog object), you can bind the list to a variable and let unification do its work:
?- Xs = [a,X,b,Y], pat(Xs,[d,a,c,b,e,d],[]).
Xs = [a, c, b, e],
X = c,
Y = e .
Edit: If you want to keep the original list around after the substitution, use copy_term:
?- Xs = [a,X,b,Y], copy_term(Xs,Ys), pat(Xs,[d,a,c,b,e,d],[]).
Xs = [a, c, b, e],
X = c,
Y = e,
Ys = [a, _G118, b, _G124] .