BNF grammar specification - grammar

Write a BNF specification where each string in the language starts with x's followed by y's followed by z's, with the following constraint: the number of occurrences of y is greater than the combined number of occurrences of x and z (|y| > |x|+|z|).

I figured it out eventually.
BNF specification:
<S'> --> <A'> <C'>
<A'> --> x<A'>yy | xyy
<C'> --> yy<C'>z | yyz
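As a quick sanity check (not part of the original answer), here is a small Python sketch. It assumes that <A'> expands to x^n y^(2n) and <C'> to y^(2m) z^m with n, m >= 1, which is how I read the two recursive rules. The helper names `satisfies_constraint` and `grammar_words` are made up for illustration, and the check allows zero x's or z's, which the question leaves open.

```python
import re

def satisfies_constraint(s):
    """True if s is x's, then y's, then z's with |y| > |x| + |z|."""
    match = re.fullmatch(r"(x*)(y*)(z*)", s)
    if match is None:
        return False
    xs, ys, zs = (len(group) for group in match.groups())
    return ys > xs + zs

def grammar_words(limit):
    """Strings the grammar derives for 1 <= n, m <= limit.

    <A'> contributes x^n y^(2n); <C'> contributes y^(2m) z^m.
    """
    return [
        "x" * n + "y" * (2 * n + 2 * m) + "z" * m
        for n in range(1, limit + 1)
        for m in range(1, limit + 1)
    ]

words = grammar_words(4)
print(all(satisfies_constraint(w) for w in words))      # True
print(satisfies_constraint("xyyyz"), "xyyyz" in words)  # True False
```

Every derived string satisfies the constraint, but a string such as xyyyz satisfies it without being derivable, so the grammar appears to cover only part of the language.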


Finite Automata string not ending with ba

Question: Build an FA that accepts only those words that do not end with ba.
I want to draw a DFA for this problem, but I don't understand how to do it. Please help me draw it.
Steps:
Draw a DFA that accepts the strings ending with "ba". Make sure it is complete, i.e. every state has a transition on both a and b.
Invert the states, i.e. make the final states non-final and the non-final states final.
[Image: DFA of strings not ending with "ba"]
An RE for a language whose strings do not end in ba is (a+b)*(aa+bb+ab).
Here every string ends in aa, bb, or ab.
To make a DFA from the RE you can use this tool.
I hope it proves helpful for you.
https://cyberzhg.github.io/toolbox/nfa2dfa
Note that this DFA accepts only strings of length 2 or more that do not end in ba, so it rejects the empty string, a, and b even though they do not end in ba either.
We need to keep track of whether we have seen substrings of ba and if we see the whole thing, make sure we're not in an accepting state at the time.
----->(q0)--b-->(q1)--a-->(q2)
Here, (q0) is accepting, (q1) is accepting and (q2) is not accepting. (q0) corresponds to having seen no part of the string ba, state (q1) to having seen the first symbol, and (q2) to having seen the whole thing. The missing transitions should therefore be:
q0 to q0 on symbol a, since if we haven't started seeing ba, a is no help; we needed a b
q1 to q1 on symbol b, since if we see b we have always at least seen the first symbol in ba
q2 to q0 on symbol a and to q1 on symbol b, for the above reasons.
The whole DFA, written as a transition table (q0 is the start state):
state   on a   on b   accepting
q0      q0     q1     yes
q1      q2     q1     yes
q2      q0     q1     no
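The three-state machine described above can be simulated directly. This is a minimal Python sketch; the names `TRANSITIONS`, `ACCEPTING`, `accepts`, and `agrees_with_definition` are my own, and the brute-force comparison only covers strings up to a chosen length.

```python
from itertools import product

# Transitions exactly as listed in the answer.
TRANSITIONS = {
    ("q0", "a"): "q0", ("q0", "b"): "q1",
    ("q1", "a"): "q2", ("q1", "b"): "q1",
    ("q2", "a"): "q0", ("q2", "b"): "q1",
}
ACCEPTING = {"q0", "q1"}  # q2 means "the input so far ends with ba"

def accepts(word):
    state = "q0"
    for symbol in word:
        state = TRANSITIONS[(state, symbol)]
    return state in ACCEPTING

def agrees_with_definition(max_len):
    """Compare the DFA with 'does not end with ba' on all short strings."""
    for length in range(max_len + 1):
        for letters in product("ab", repeat=length):
            word = "".join(letters)
            if accepts(word) != (not word.endswith("ba")):
                return False
    return True

print(accepts(""), accepts("ab"), accepts("ba"), accepts("abba"))  # True True False False
print(agrees_with_definition(8))  # True
```

Unlike the regular expression in the other answer, this machine also accepts the empty string, a, and b.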

Matching a string which includes -,.$\/ with a regex

I am trying to match a string that includes -,.$/ (and might include other special characters that I don't know about yet) with a regex. I have to match the first 28 characters in the string.
The string is:
Received - Data Migration 1. Units, of UNITED STATES $ CXXX CORPORATION COMMON SHARE STOCK CERTIFICATE NO. 323248 987,837 SHARES PAR VAL $1.00 NOT ADMINISTERED XX XX, XXXSFHIGSKF/XXXX PURPOSES ONLY
The regex I am using is ((([\w-,.$\/]+)\s){28}).*
Is there a better way to match special characters?
Also, I get an error if the string length is less than 28. What can I do to specify a range, so that the regex works even if the string is shorter than 28 characters?
The code looks something like this:
Select regexp_extract(Txn_Desc,'((([\w-,.$;!#\/%)^#<>&*(]+)\s){1,28}).*',1) as Transaction_Short_Desc,Txn_Desc
from Table x
It seems you are looking for 28 tokens.
Try
(\S+\s+){0,28}
or
([^ ]+ +){0,28}
This is the result for 8 tokens:
Received - Data Migration 1. Units, of UNITED
|        | |    |         |  |      |  |
1        2 3    4         5  6      7  8
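To see how the token-based pattern behaves, here is a small sketch in Python's `re` module. That is an assumption for illustration: the query in the question runs in a SQL engine whose regex dialect may differ. The function names are hypothetical. Note that `(\S+\s+){0,28}` requires whitespace after every token, so a final token at the very end of a short string is left out; the second function is one possible variant that keeps it.

```python
import re

def first_tokens(text, limit=28):
    # Pattern from the answer: up to `limit` repetitions of "token + whitespace".
    pattern = re.compile(r"(?:\S+\s+){0,%d}" % limit)
    return pattern.match(text).group(0).rstrip()

def first_tokens_keep_last(text, limit=28):
    # Variant: one token, then up to limit-1 "whitespace + token" groups,
    # so a last token with nothing after it is still captured.
    match = re.match(r"\S+(?:\s+\S+){0,%d}" % (limit - 1), text)
    return match.group(0) if match else ""

sample = "Received - Data Migration 1. Units, of UNITED STATES $ CXXX"
print(first_tokens(sample, 8))               # Received - Data Migration 1. Units, of UNITED
print(first_tokens("short text"))            # short
print(first_tokens_keep_last("short text"))  # short text
```

Because the quantifier is a range ({0,28} rather than {28}), strings with fewer than 28 tokens still match instead of failing.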

Construct grammar given the following language {a^n b^m | n,m = 0,1,2,...,n <= 2m} [closed]

I just took my midterm but couldn't answer this question.
Can someone please give a couple of examples of strings in the language and construct a grammar for it, or at least show me how I would go about it?
Also, how do I write a grammar for L:
L = {a^n b^m | n,m = 0,1,2,..., n <= 2m}?
Thanks in advance.
How to write grammar for formal language?
Before reading this answer, you should first read: Tips for creating context-free grammars.
Grammar for {a^n b^m | n,m = 0,1,2,..., n <= 2m}
What is the description of your language L = {a^n b^m | n,m = 0,1,2,..., n <= 2m}?
Language description:
The language L consists of all strings in which a's are followed by b's, where the number of b's is greater than or equal to half the number of a's.
To understand this more clearly:
In the pattern a^n b^m, the a's come first and then the b's. The total number of a's is n and the number of b's is m. The inequality states the relation between n and m:
given: n <= 2m
=> n/2 <= m, which means `m` must be greater than or equal to n/2
=> numberOf(b) >= numberOf(a)/2 ...eq-1
So the inequality between n and m says:
numberOf(b) must be greater than or equal to half of numberOf(a)
Some example strings in L:
b     numberOf(a)=0 and numberOf(b)=1; this satisfies eq-1
bb    numberOf(a)=0 and numberOf(b)=2; this satisfies eq-1
So any number of b's is possible without any a's (any string of b's), because any number of b's is at least 0/2 = 0.
Other examples:
string    numberOf(a)   numberOf(b)   check against eq-1
--------------------------------------------------------
ab        1             1             1 > 1/2
abb       1             2             2 > 1/2
abbb      1             3             3 > 1/2
aabb      2             2             2 > 2/2 = 1
aaabb     3             2             2 > 3/2 = 1.5
aaaabb    4             2             2 = 4/2 = 2
Points to note:
All the strings above are possible because the number of b's is either equal to half the number of a's or greater.
An interesting point is that the number of a's can also exceed the number of b's, but by no more than a factor of two, whereas the number of b's can exceed the number of a's by any amount.
Two more important cases are:
A string consisting only of a's is not possible.
Note: the null string ^ is also allowed, because in ^, numberOf(a) = numberOf(b) = 0, which satisfies the inequality.
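The membership condition (eq-1 plus the "a's before b's" shape) can be written as a short check. This is only an illustrative Python sketch; the function name `in_language` is my own.

```python
import re

def in_language(s):
    """True if s has the form a^n b^m with n <= 2m."""
    match = re.fullmatch(r"(a*)(b*)", s)
    if match is None:
        return False  # wrong shape, e.g. a b before an a
    n, m = len(match.group(1)), len(match.group(2))
    return n <= 2 * m

for word in ["", "b", "ab", "aaabb", "aaaabb", "a", "aaaaabb", "ba"]:
    print(repr(word), in_language(word))
```

The first five words are members (including the null string), while a, aaaaabb, and ba are rejected.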
At first it looks as though writing the grammar is tough, but it really is not...
According to the language description, we need the following kinds of rules:
Rule 1: to generate the null string ^.
N --> ^
Rule 2: to generate any number of b's.
B --> bB | b
Rule 3: to generate a's:
(1) Remember that you can't generate too many a's without generating b's.
(2) Because the number of b's must be at least half the number of a's, you need to generate one b for every alternate a.
(3) A string of only a's is not possible, so for the first (odd) a of each pair you must add a b along with the a.
(4) For the second (even) a of each pair you may skip adding a b (but adding one is also allowed).
So your overall grammar is:
S --> ^ | A | B
B --> bB | b
A --> aCB | aAB | ^
C --> aA | ^
Here S is the start variable.
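In the grammar rules above, A --> aCB | aAB | ^ may be confusing, so here is my explanation:
A --> aCB | aAB | ^
In A --> aCB, the variable C stands for the second a of a pair:
C --> aA adds that second a without adding a `b` (and C --> ^ skips it),
whereas A --> aAB adds a single a and keeps its own b.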
Let us generate some strings in the language using these grammar rules. I am writing leftmost derivations to avoid further explanation.
ab      S --> A --> aCB --> aB --> ab
abb     S --> A --> aCB --> aB --> abB --> abb
abbb    S --> A --> aCB --> aB --> abB --> abbB --> abbb
aabb    S --> A --> aAB --> aaABB --> aaBB --> aabB --> aabb
aaabb   S --> A --> aCB --> aaAB --> aaaABB --> aaaBB --> aaabB --> aaabb
aaaabb  S --> A --> aCB --> aaAB --> aaaCBB --> aaaaABB --> aaaaBB --> aaaabB --> aaaabb
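The leftmost derivations above can be replayed mechanically. In this Python sketch, which is my own illustration, the empty string stands for ^ and `leftmost_derivation` is a hypothetical helper that always rewrites the leftmost variable and rejects any step that is not a rule of the grammar.

```python
GRAMMAR = {
    "S": ["", "A", "B"],       # "" plays the role of ^
    "B": ["bB", "b"],
    "A": ["aCB", "aAB", ""],
    "C": ["aA", ""],
}

def leftmost_derivation(choices, start="S"):
    """Apply the chosen right-hand sides, always to the leftmost variable."""
    form = start
    steps = [form]
    for rhs in choices:
        # The leftmost variable is the first symbol that is a grammar key.
        index = next(i for i, symbol in enumerate(form) if symbol in GRAMMAR)
        if rhs not in GRAMMAR[form[index]]:
            raise ValueError(f"{form[index]} --> {rhs or '^'} is not a rule")
        form = form[:index] + rhs + form[index + 1:]
        steps.append(form)
    return steps

steps = leftmost_derivation(["A", "aCB", "aA", "aCB", "aA", "", "b", "b"])
print(" --> ".join(step or "^" for step in steps))
# S --> A --> aCB --> aaAB --> aaaCBB --> aaaaABB --> aaaaBB --> aaaabB --> aaaabb
```

The printed chain matches the derivation of aaaabb given above.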
One more, for a non-member string:
According to the language, a^5 b^2 = aaaaabb is not possible, because 2 >= 5/2 = 2.5 fails. So we can't generate this string with the grammar either. I try to show this below.
In our grammar, to generate extra a's we have to use the C variable.
S --> A
--> aCB
--> aaAB
--> aa aCB B
--> aaa aA BB
--> aaaa aCB BB
         ---
          ^
here the fifth `a` (the first a of a new pair) forces another B, so I have to add at least a third `b` too
My answer is complete, but I think you can also change A's rules to:
A --> aCB | A | ^
Give it a try!!
EDIT:
As @us2012 commented: "It would seem to me that then, S -> ^ | ab | aaSb | Sb would be a simpler description." I feel this question is useful for the OP and for others as well.
OP's language:
L = {a^n b^m | n,m = 0,1,2,..., n <= 2m}.
@us2012's grammar:
S -> ^ | ab | aaSb | Sb
@us2012's question:
Does this grammar also generate the language L?
The answer is yes!
The inequality in the language between the number of a's (n) and the number of b's (m) is n <= 2m.
We can also read it as:
n <= 2m
that is
numberOf(a) <= twice numberOf(b)
In the grammar, whenever we add one or two a's we also add one b, so ultimately the number of a's can't be more than twice the number of b's.
The grammar also has rules to generate any number of b's and the null string ^.
So the simplified grammar provided by @us2012 is CORRECT and generates exactly the language L.
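One way to gain confidence in both grammars is a bounded brute-force comparison. This is a sketch, not a proof: it only checks strings up to a chosen length. The Python code below is my own illustration; `generate` and `language` are hypothetical helper names, the empty string stands for ^, and every key of a grammar dictionary is treated as a variable.

```python
from collections import deque

FIRST_GRAMMAR = {
    "S": ["", "A", "B"],
    "B": ["bB", "b"],
    "A": ["aCB", "aAB", ""],
    "C": ["aA", ""],
}
US2012_GRAMMAR = {"S": ["", "ab", "aaSb", "Sb"]}

def generate(grammar, max_len, start="S"):
    """All terminal strings of length <= max_len, via leftmost derivations."""
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        form = queue.popleft()
        index = next((i for i, s in enumerate(form) if s in grammar), None)
        if index is None:          # no variables left: a finished word
            words.add(form)
            continue
        for rhs in grammar[form[index]]:
            new = form[:index] + rhs + form[index + 1:]
            terminals = sum(1 for s in new if s not in grammar)
            # Terminals never disappear, so longer forms can be pruned.
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return words

def language(max_len):
    """{a^n b^m | n <= 2m} restricted to length <= max_len."""
    return {
        "a" * n + "b" * m
        for n in range(max_len + 1)
        for m in range(max_len + 1 - n)
        if n <= 2 * m
    }

print(generate(FIRST_GRAMMAR, 8) == language(8))   # True
print(generate(US2012_GRAMMAR, 8) == language(8))  # True
```

Up to length 8, both grammars produce exactly the strings of L, which is consistent with the argument above.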
Notice: the first solution came from a derivation, as I described in my linked answer. I started with the language description, then tried to write some basic rules, and progressively arrived at the complete grammar.
@us2012's answer, on the other hand, came from aptitude. You can gain the aptitude to write grammars by reading others' solutions and writing your own, just like how you learn programming.

How does the Soundex function work in SQL Server?

Here's an example of Soundex code in SQL:
SELECT SOUNDEX('Smith'), SOUNDEX('Smythe');
----- -----
S530 S530
How does 'Smith' become S530?
In this example, the first character is S because that's the first character of the input expression, but how are the remaining three digits calculated?
Take a look at this article:
The first letter of the code corresponds to the first letter of the
name. The remainder of the code consists of three digits derived from
the syllables of the word according to the following code:
1 = B, F, P, V
2 = C, G, J, K, Q, S, X, Z
3 = D, T
4 = L
5 = M,N
6 = R
The double letters with the same Soundex code, A, E, I, O, U, H, W, Y,
and some prefixes are being disregarded...
So for Smith and Smythe the code is created like this:
Smith   Smythe      code
S       S       ->  S
m       m       ->  5
i       y       ->  0
t       t       ->  3
h       h       ->  0
        e       ->  -
What is Soundex?
Soundex is:
a phonetic algorithm for indexing names by sound, as pronounced in English; first developed by Robert C. Russell and Margaret King Odell in 1918
How does it work?
There are several implementations of Soundex, but most implement the following steps:
Retain the first letter of the name and drop all other occurrences of vowels, y, h, and w:
| a, e, i, o, u, y, h, w | → "" |
Replace consonants with numbers as follows (after the first letter):
| b, f, p, v | → 1 |
| c, g, j, k, q, s, x, z | → 2 |
| d, t | → 3 |
| l | → 4 |
| m, n | → 5 |
| r | → 6 |
Replace identical adjacent numbers with a single value (if they were next to each other prior to step 1):
| M33 | → M3 |
Pad with zeros or truncate to produce a four-character result:
| M3 | → M300 |
| M34123 | → M341 |
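The steps above can be sketched in a few lines of Python. This is a simplified illustration that follows the steps exactly as listed here, so any dropped letter (vowel, y, h, or w) breaks adjacency. It is not SQL Server's implementation and may differ from `SOUNDEX` on edge cases; the names `GROUPS`, `CODES`, and `soundex` are my own.

```python
GROUPS = ["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"]
CODES = {
    letter: str(digit)
    for digit, group in enumerate(GROUPS, start=1)
    for letter in group
}

def soundex(name):
    letters = [ch for ch in name.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""
    # The first letter is kept as-is, but its code still counts for adjacency.
    previous = CODES.get(letters[0], "")
    digits = []
    for letter in letters[1:]:
        code = CODES.get(letter, "")  # vowels, y, h, w have no code
        if code and code != previous:
            digits.append(code)       # skip repeats of an adjacent code
        previous = code
    # Pad with zeros, then cut to one letter plus three digits.
    return (letters[0].upper() + "".join(digits) + "000")[:4]

print(soundex("Smith"), soundex("Smythe"))  # S530 S530
```

For Smith, only m (5) and t (3) contribute digits, and the final 0 is padding.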
In SQL Server, SOUNDEX is often used in conjunction with DIFFERENCE, which scores how many of the resulting digits are identical (just like the game Mastermind), with higher numbers matching most closely.
What are the alternatives?
It's important to understand the limitations and criticisms of Soundex and where people have tried to improve on it: notably, it is rooted only in English pronunciation, and it discards a lot of data, resulting in more false positives.
Both Metaphone and Double Metaphone still focus on English pronunciation, but add much more granularity for the nuances of speech in English (e.g. PH → F).
Phil Factor wrote a Metaphone function in SQL, with the source on GitHub.
Soundex is most commonly used for identifying similar names, and it will have a really hard time finding similar nicknames (e.g. Robert → Rob or Bob). Per this question on a database of common name aliases / nicknames of people, you could incorporate a lookup against similar nicknames in your matching process as well.
Here are a couple of free lists of common nicknames:
SOEMPI - name_to_nick.csv | GitHub
carltonnorthern - names.csv | GitHub
Further Reading:
Fuzzy matching using T-SQL
SQL Server – Do You Know Soundex Functions?

How can I construct a grammar that generates this language?

I'm studying for a finite automata & grammars test and I'm stuck with this question:
Construct a grammar that generates L:
L = {a^n b^m c^(m+n) | n >= 0, m >= 0}
I believe my productions should go along these lines:
S->aA | aB
B->bB | bC
C->cC | c    (here's where I have doubts)
How can my production for C remember the values of m and n? I'm guessing this must be a context-free grammar; if so, what should it look like?
It seems like it should be:
A->aAc | aBc | ac | epsilon
B->bBc | bc | epsilon
You need to force the c's to be counted during the construction process. To show that the language is not regular (so that a context-free grammar is really needed), I would consider using the pumping lemma.
S -> X
X -> aXc | Y
Y -> bYc | e
where e == epsilon, and X is unnecessary but added for clarity
Yes, this does sound like homework, but a hint:
Every time you match an 'a', you must match a 'c'. Same for matching a 'b'.
S->aSc|A
A->bAc|λ
This means that whenever you get an a you have at least one c, and if you get an a and a b you must have two c's.
I hope this has been helpful.
Well guys, this is how I'll do it:
P={S::=X|epsilon,
X::=aXc|M|epsilon,
M::=bMc|epsilon}
My answer:
S -> aAc | aSc
A -> bc | bAc
where S is the start symbol.
S -> aBc | epsilon
B -> bBc | S | epsilon
This takes care of the order of the symbols as well.
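Since several candidate grammars were posted, a bounded brute-force comparison can help tell them apart. The Python sketch below is my own illustration: it covers only the answers that name S as the start symbol, writes epsilon as the empty string, treats every dictionary key as a variable, and checks strings up to length 6 only, so it is evidence rather than a proof. `generate`, `language`, and `CANDIDATES` are hypothetical names.

```python
from collections import deque

def generate(grammar, max_len, start="S"):
    """All terminal strings of length <= max_len, via leftmost derivations."""
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        form = queue.popleft()
        index = next((i for i, s in enumerate(form) if s in grammar), None)
        if index is None:          # no variables left: a finished word
            words.add(form)
            continue
        for rhs in grammar[form[index]]:
            new = form[:index] + rhs + form[index + 1:]
            terminals = sum(1 for s in new if s not in grammar)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return words

def language(max_len):
    """{a^n b^m c^(n+m)} restricted to length <= max_len."""
    return {
        "a" * n + "b" * m + "c" * (n + m)
        for n in range(max_len + 1)
        for m in range(max_len + 1)
        if 2 * (n + m) <= max_len
    }

CANDIDATES = {
    "S->X, X->aXc|Y, Y->bYc|e": {"S": ["X"], "X": ["aXc", "Y"], "Y": ["bYc", ""]},
    "S->aSc|A, A->bAc|e": {"S": ["aSc", "A"], "A": ["bAc", ""]},
    "S->X|e, X->aXc|M|e, M->bMc|e": {"S": ["X", ""], "X": ["aXc", "M", ""], "M": ["bMc", ""]},
    "S->aAc|aSc, A->bc|bAc": {"S": ["aAc", "aSc"], "A": ["bc", "bAc"]},
    "S->aBc|e, B->bBc|S|e": {"S": ["aBc", ""], "B": ["bBc", "S", ""]},
}

target = language(6)
for label, grammar in CANDIDATES.items():
    words = generate(grammar, 6)
    if words == target:
        print(label, ": exact up to length 6")
    else:
        print(label, ": missing", sorted(target - words), "extra", sorted(words - target))
```

On this bounded check, the first three grammars match the language exactly. S -> aAc | aSc with A -> bc | bAc only yields strings with n >= 1 and m >= 1. The last grammar derives abaccc (an a after a b) while missing bc.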