Defining a language in EBNF - grammar

Give the EBNF specification for the language L that is made up of the chars a, b and c such that sentences in the language have the form
L : sqsR
-s is a string of any combination of the characters a and b
-sR is that same string s reversed
-q is an odd number of c's followed by either an odd number of b's
or an even number of a’s.
What I have so far:
L -> S
S -> {a}{b}Q
Q ->
If this is right, I'm still not really sure how to produce from Q and also how to represent S in reverse.

This is a string that starts and ends with the same string, but reversed:
X -> aXa
-> bXb
This is a string with an odd number of c's:
Y -> cY2
Y2 -> ccY2
I've left out some crucial bits, but hopefully this can get you started.

Try building the first two parts from the middle out
You can force an odd number of repetitions by starting with exactly one item and adding N*2 additional items (for integer N). This should suggest how to force an even number as well

Related

Formal Languages - Grammar

I am taking a Formal Languages and Computability class and am having a little trouble understanding the concept of grammar. One of my assignment questions is this:
Take ∑ = {a,b}, and let na(w) and nb(w) denote the number of a's and b's in the string w, respectively. Then the grammar G with productions:
S -> SS
S -> λ
S -> aSb
S -> bSa
generates the language L = {w: na(w) = nb(w)}.
1) The language in the example contains an empty string. Modify the given grammar so that it generates L - {λ}.
I am thinking that I should modify the condition of L, something like:
L = {w: na(w) = nb(w), na, nb > 0}
That way, we indicate that the string is never empty.
2) Modify the grammar in the example so that it will generate L ∪ {anbn+1: n >= 0}.
I am not sure on how to do this one. Should that mean I make one more condition in the grammar, adding something like S -> aSbb?
Any explanation about these two questions would be greatly appreciated. I'm still trying to figure these grammar stuff out so I am not sure about my answers.
1) The question is about modifying the grammar to obtain a new language; so don't modify directly the language…
Your grammar generates the empty word because of the production:
S -> λ
So you could think of removing this production altogether. This yields the following grammar:
S -> SS
S -> aSb
S -> bSa
Unfortunately, this grammar doesn't generate a language (a bit like in induction, it misses an initial: there are no productions that only consist of terminals). To fix this, add the following productions:
S -> ab
S -> ba
2) Don't randomly try to add production rules in the hope that it's going to work. Here you want a's followed by b's. So the production rule
S -> bSa
must certainly disappear. Also, the rule
S -> SS
would produce, e.g., abab (try to see how this is obtained). So we'll have to remove it too. We're left with:
S -> λ
S -> aSb
Now this grammar generates:
λ
ab
aabb
aaabbb
etc. That's not bad at all! To get an extra trailing b, we could create a new non-terminal, say T, replace our current S by T, and add that trailing b in S:
T -> λ
T -> aTb
S -> Tb
I know that this is homework; I gave you the solutions to your homework: that's because, from the way you asked your question, it seems you're completely lost. I hope this answer will help you get on the right path!

is this regular grammar- S -> 0S0/00?

Let L denotes the language generated by the grammar S -> 0S0/00. Which of the following is true?
(A) L = 0+
(B) L is regular but not 0+
(C) L is context free but not regular
(D) L is not context free
HI can anyone explain me how the language represented by the grammar S -> 0S0/00 is regular? I know very well the grammar is context free but not sure how can that be regular?
If you mean the language generated by the grammar
S -> 0S0
S -> 00
then it should be clear that it is the same language as is generated by
S -> 00S
S -> 00
which is a left regular grammar, and consequently generates a regular language. (Some people would say that a left regular grammar can only have a single terminal in each production, but it is trivial to create a chain of aN productions to produce the same effect.)
It should also be clear that the above differs from
S -> 0S
S -> S
We know that a language is regular if there exists a DFA (deterministic finite automata) that recogognizes it, or a RE (Regular expression). Either way we can see here that your grammar generates word like : 00, 0000, 000000, 00000000.. etc so it's words that starts and ends with '0' and with an even number of zeroes greater or equal than length two.
Here's a DFA for this grammar
Also here is a RE (Regular expression) that recognizes the language :
(0)(00)*(0)
Therefore you know this language recognized by this grammar is regular.
(Sorry if terms aren't 100% accurate, i took this class in french so terms might differ a bit) let me know if you have any other questions!
Consider first the definition of a regular grammar here
https://www.cs.montana.edu/ross/theory/contents/chapter02/green/section05/page04.xhtml
So first we need a set N of non terminal symbols (symbols that can be rewritten as a combination of terminal and non-terminal symbols), for our example N={S}
Next we need a set T of terminal symbols (symbols that cannot be replaced), for our example T={0}
Now a set P of grammer rules that fit a very specific form (see link), for L we see that P={S->0S0,S->00}. Both of these rules are of regular form (meaning each non-terminal can be replaced with a terminal, a terminal then a non-terminal, or the empty string, see link for more info). So we have our rules.
Now we just need a starting symbol X, we can trivally say that our starting symbol is S.
Therefore the tuple (N={S},T={0},P={S->0S0,S->00},X=S) fits the requirements to be defined a regular grammar.
We don't need the machinery of regular grammars to answer your question. Just note the possible derivations all look like this:
S -> (0 S 0) -> 0 (0 S 0) 0 -> 0 0 (0 S 0) 0 0 -> ... -> 0...0 (0 0) 0...0
\_ _/ \_ _/
k k
Here I've added parens ( ) to show the result of the previous expansion of S. These aren't part of the derived string. I.e. we substitute S with 0 S 0 k >= 0 times followed by a single substitution with 00.
From this is should be easy to see L is the set of strings of 0's of length 2k + 2 for some integer k >= 0. A shorthand notation for this is
L = { 02m | m >= 1 }
In words: The set of all even length strings of zeros excluding the empty string.
To prove L is regular, all we need is a regular expression for L. This is easy: (00)+. Or if you prefer, 00(00)*.
You might be confused because a small change to the grammar makes its language context free but not regular:
S -> 0S1/01
This is the more complex language { 0m 1m | m >= 1 }. It's straightforward to show this isn't a regular language using the Pumping Lemma.

regular/context free grammar

Im hoping someone can help me understand a question I have, its not homework, its just an example question I am trying to work out. The problem is to define a grammar that generates all the sums of any number of operands. For example, 54 + 3 + 78 + 2 + 5... etc. The way that I found most easy to define the problem is:
non-terminal {S,B}
terminal {0..9,+,epsilon}
Rules:
S -> [0..9]S
S -> + B
B -> [0..9]B
B -> + S
S -> epsilon
B -> epsilon
epsilon is an empty string.
This seems like it should be correct to me as you could define the first number recursively with the first rule, then to add the next integer, you could use the second rule and then define the second integer using the third rule. You could then use the fourth rule to go back to S and define as many integers as you need.
This solution seems to me to be a regular grammar as it obeys the rule A -> aB or A -> a but in the notes it says for this question that it is no possible to define this problem using a regular grammar. Can anyone please explain to me why my attempt is wrong and why this needs to be context free?
Thanks.
Although it's not the correct definition, it's easier to think that for a language to be non-regular it would need to balance something (like parenthesis).
Your problem can be solved using direct recursion only on the sides of the rules, not in the middle, so it can be solved using a regular language. (Again, this is not the correct definition, but it's easier to remember!)
For example, for a regular expression engine (like in Perl or JavaScript) one could easily write /(\d+)(\+(\d+))*/.
You could write it this way:
non-terminal {S,R,N,N'}
terminal {0..9,+,epsilon}
Rules:
S -> N R
S -> epsilon
N -> [0..9] N'
N' -> N
N' -> epsilon
R -> + N R
R -> epsilon
Which should work correctly.
The language is regular. A regular expression would be:
((0|1|2|...|9)*(0|1|2|...|9)+)*(0|1|2|...|9)*(0|1|2|...|9)
Terminals are: {0,1,2,...,9,+}
"|" means union and * stands for Star closure
If you need to have "(" and ")" in your language, then it will not be regular as it needs to match parentheses.
A sample context free grammar would be:
E->E+E
E->(E)
E->F
F-> 0F | 1F | 2F | ... | 9F | 0 | 1 | ... | 9

Read free format with no advance

In a given file record, I need to read the first two integer elements at first, and then the rest of the line (a large number of real elements), because the assignment depend on the first 2. Suppose the format of the first two integer elements is not really well defined.
The best way to solve the problem could be something:
read(unitfile, "(I0,I0)", advance='no') ii, jj
read(unitfile,*) aa(ii,jj,:)
But it seems to me the "(I0)" specification is not allowed in gfortran.
Basically the file read in unitfile could be something like:
0 0 <floats>
0 10 <floats>
10 0 <floats>
100 0 <floats>
100 100 <floats>
which is hard to be read with any fortran-like fixed field format specification.
Is there any other way to get around this, apparently trivial, problem?
This applies string manipulations to get the individual components, separated by blanks ' ' and/or tabs (char(9)):
program test
implicit none
character(len=256) :: string, substring
integer :: ii, jj, unitfile, stat, posBT(2), pos
real, allocatable :: a(:)
open(file='in.txt', newunit=unitfile, status='old' )
read(unitfile,'(a)') string
! Crop whitespaces
string = adjustl(trim(string))
! Get first part:
posBT(1) = index(string,' ') ! Blank
posBT(2) = index(string,char(9)) ! Tab
pos = minval( posBT, posBT > 0 )
substring = string(1:pos)
string = adjustl(string(pos+1:))
read(substring,*) ii
! Get second part:
posBT(1) = index(string,' ') ! Blank
posBT(2) = index(string,char(9)) ! Tab
pos = minval( posBT, posBT > 0 )
substring = string(1:pos)
string = adjustl(string(pos+1:))
read(substring,*) jj
! Do stuff
allocate( a(ii+jj), stat=stat )
if (stat/=0) stop 'Cannot allocate memory'
read(string,*) a
print *,a
! Clean-up
close(unitfile)
deallocate(a)
end program
For a file in.txt like:
1 2 3.0 4.0 5.0
This results in
./a.out
3.00000000 4.00000000 5.00000000
NOTE: This is just a quick&dirty example, adjust it to your needs.
[This answer has been significantly revised: the original was unsafe. Thanks to IanH for pointing that out.]
I generally try to avoid doing formatted input which isn't list-directed, when I can afford it. There's already an answer with string parsing for great generality, but I'll offer some suggestions for a simpler setting.
When you are relaxed about trusting the input, such as when it's just the formatting that's a bit tricky (or you 're happy leaving it to your compiler's bounds checking), you can approach your example case with
read(unitfile, *) ii, jj, aa(ii, jj, :)
Alternatively, if the array section is more complicated than given directly by the first two columns, it can be by an expression, or even by functions
read(unitfile, *) ii, jj, aa(fi(ii,jj), fj(ii,jj), :fn(ii,jj))
with pure integer function fi(ii,jj) etc. There is even some possibility of having range validation in those functions (returning a size 0 section, for example).
In a more general case, but staying list-directed, one could use a buffer for the real variables
read(unitfile, *) ii, jj, buffer(:) ! Or ... buffer(:fn(ii,jj))
! Validate ii and jj before attempting to access aa with them
aa(.., .., :) = buffer
where buffer is of suitable size.
Your first considered approach suggests you have some reasonable idea of the structure of the lines, including length, but when the number of reals is unknown from ii and jj, or when the type (and polymorphism reading isn't allowed) is not known, then things do indeed get tricky. Also, if one is very sensitive about validating input, or even providing meaningful detailed user feedback on error, this is not optimal.
Finally, iostat helps.

Specifying language for a given grammar

DFA problem : Write a complete grammar for L, including the quadruple and the production rules
L ={x: ∃y ∈ {a, b}* : x = ay}
Answer:
G={{S, A}, {a, b}, S, P}
P: S => aA
A => aA | bA | λ
My question is :
Why there is λ for A, but there is no λ for S?
From the language definition, it is any string that begins with an a and contains only a's and b's , but why in the answer A => bA. Does not it mean that the string starts with b if it is A => bA?
Thank you so much
1. Why there is λ for A, but there is no λ for S?
λ nul can be derived from A to convert a sentimental from into sentence. Additionally according to language statement prefix sub-string y ∈ {a, b}* can be nul (a empty string) e.g. "a" is a string belongs to the language. If y contain any symbol then length of language will be more than one.
S doesn't derive λ nul because empty (or say nul string) is not in language. The smallest string in language is single "a".
2. From the language definition, it is any string that begins with an a and contains only a's and b's , but why in the answer A => bA. Does not it mean that the string starts with b if it is A => bA?
Note only strings those can derived from start variable S are included in language of grammar. You can't start derivation from A (that is not start variable). And if you start a derivation from S your string will always start with a symbol.
I suggest you to read: "Why the need for terminals? Is my solution sufficient enough?" Where I written about basic definition of formal grammar.