Why is this grammar not context-sensitive?

I have got this grammar:
G = (N, Sigma, P, S)
with
N = {S, A, B}
Sigma = {a},
P: S -> e
S -> ABA
AB -> aa
aA -> aaaA
A -> a
Why is this a grammar of only type 0?
I think it is because of aA -> aaaA, but I don't see how it is in conflict with the rules.
The rules have to be built like this:
x1 A x2 -> x1 B x2, where:
A is an element of N;
x1, x2 are elements of V*;
and B is an element of VV*.
With V = N united Sigma, I don't see the problem here.
a is from V, and A is from N, and to the right of A there could be the empty word, which is also in V*, so the left side would be okay.
On the right side there is x1 = a again; then we could say aaA is in VV*, with the first a being from V and aA being from V*, while x2 is the empty word again.

"The rules have to be built like this:
x1 A x2 -> x1 B x2, where: ..."
Yes, that's correct. But there exists an equivalent definition of the rules of type-1 grammars:
p -> q, where
p, q are elements of V^+, length(p) <= length(q), and (naturally) p contains an element of N.
(S -> e is admitted as the usual special case, since S does not occur on any right-hand side, which holds here.)
Your grammar has only rules that satisfy this form, so your grammar is type-1.
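To make the check concrete, here is a small Python sketch (my own illustration, not part of the answer) that tests the noncontracting condition length(p) <= length(q) for each rule, treating S -> e as the usual special case:

```python
# Rules of the grammar above as (lhs, rhs) pairs; "e" stands for the empty word.
rules = [("S", "e"), ("S", "ABA"), ("AB", "aa"), ("aA", "aaaA"), ("A", "a")]

def is_type1(rules, start="S", empty="e"):
    """Check the noncontracting (monotone) characterisation of type-1 grammars."""
    has_eps = (start, empty) in rules
    # S -> e is only admissible if S never occurs on a right-hand side.
    if has_eps and any(start in rhs for _, rhs in rules):
        return False
    return all(len(lhs) <= len(rhs)
               for lhs, rhs in rules
               if (lhs, rhs) != (start, empty))

print(is_type1(rules))  # True: every rule is noncontracting, so the grammar is type-1
```

Note that aA -> aaaA passes immediately: only the length comparison matters under this definition, not the strict x1 A x2 shape.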


Constructing a linear grammar for the language

I'm having difficulty constructing a grammar for this language, especially a linear grammar.
Can anyone please give me some basic tips/methodology for constructing a grammar for a given language? Thanks in advance.
I have a doubt whether this answer to the question "Construct a linear grammar for the language" is right:
L ={a^n b c^n | n belongs to Natural numbers}
Solution:
Right-Linear Grammar :
S--> aS | bA
A--> cA | ^
Left-Linear Grammar:
S--> Sc | Ab
A--> Aa | ^
As pointed out in the comments, these grammars are wrong since they generate strings not in the language. Here's a derivation of abcc in both grammars:
S -> aS -> abA -> abcA -> abccA -> abcc
S -> Sc -> Scc -> Abcc -> Aabcc -> abcc
Also as pointed out in the comments, there is a simple linear grammar for this language, where a linear grammar is defined as having at most one nonterminal symbol in the RHS of any production:
S -> aSc | b
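As a sanity check, a short Python sketch (mine, not from the answer) enumerates the strings derivable from S -> aSc | b up to a bounded number of rule applications:

```python
def gen(depth):
    """All terminal strings derivable from S -> aSc | b
    using at most `depth` rule applications."""
    if depth == 0:
        return set()
    out = {"b"}                    # S -> b
    for s in gen(depth - 1):       # S -> aSc wraps a shorter derivation
        out.add("a" + s + "c")
    return out

strings = sorted(gen(4))
print(strings)  # ['aaabccc', 'aabcc', 'abc', 'b']
# every string has the shape a^n b c^n
assert all(s == "a" * s.count("a") + "b" + "c" * s.count("a") for s in strings)
```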
There are some general rules for constructing grammars for languages. These are either obvious simple rules or rules derived from closure properties and the way grammars work. For instance:
if L = {a} for an alphabet symbol a, then S -> a is a grammar for L.
if L = {e} for the empty string e, then S -> e is a grammar for L.
if L = R U T for languages R and T, then S -> S' | S'', together with the grammars for R and T, is a grammar for L if S' is the start symbol of the grammar for R and S'' is the start symbol of the grammar for T.
if L = RT for languages R and T, then S -> S'S'', together with the grammars for R and T, is a grammar for L, with S' and S'' as above.
if L = R* for language R, then S -> S'S | e, together with the grammar for R, is a grammar for L, with S' as above.
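Rules 3-5 can be read as grammar combinators. A minimal Python sketch (the dict representation and function names are my own): a grammar maps each nonterminal to a list of alternatives, each alternative a list of symbols.

```python
def union(g1, s1, g2, s2, start="S0"):
    """L = R U T: new start S0 -> S' | S'' (assumes disjoint nonterminal names)."""
    return {start: [[s1], [s2]], **g1, **g2}

def concat(g1, s1, g2, s2, start="S0"):
    """L = RT: new start S0 -> S' S''."""
    return {start: [[s1, s2]], **g1, **g2}

def star(g1, s1, start="S0"):
    """L = R*: new start S0 -> S' S0 | e (the empty alternative)."""
    return {start: [[s1, start], []], **g1}

ga = {"A": [["a"]]}   # grammar for {a}
gb = {"B": [["b"]]}   # grammar for {b}
print(union(ga, "A", gb, "B"))
# {'S0': [['A'], ['B']], 'A': [['a']], 'B': [['b']]}
```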
Rules 4 and 5, as written, do not preserve linearity. Linearity can be preserved for left-linear and right-linear grammars (since those grammars describe regular languages, and regular languages are closed under these kinds of operations); but linearity cannot be preserved in general. To prove this, an example suffices:
R -> aRb | ab
T -> cTd | cd
L = RT = {a^n b^n c^m d^m | n, m > 0}
L' = R* = the set of concatenations of blocks a^n b^n with n > 0
Suppose there were a linear grammar for L. We must have a production for the start symbol S that produces something. To produce something, we require a string of terminal and nonterminal symbols. To be linear, we must have at most one nonterminal symbol. That is, our production must be of the form
S := xYz
where x is a string of terminals, Y is a single nonterminal, and z is a string of terminals. If x is non-empty, reflection shows the only useful choice is a; anything else fails to derive known strings in the language. Similarly, if z is non-empty, the only useful choice is d. This gives four cases:
x empty, z empty. This is useless, since we now have the same problem to solve for nonterminal Y as we had for S.
x = a, z empty. Y must now generate exactly a^n' b^n' b c^m d^m where n' = n - 1. But then the exact same argument applies to the grammar whose start symbol is Y.
x empty, z = d. Y must now generate exactly a^n b^n c c^m' d^m' where m' = m - 1. But then the exact same argument applies to the grammar whose start symbol is Y.
x = a, z = d. Y must now generate exactly a^n' b^n' bc c^m' d^m' where n' and m' are as in 2 and 3. But then the exact same argument applies to the grammar whose start symbol is Y.
None of the possible choices for a useful production for S is actually useful in getting us closer to a string in the language. Therefore, no strings are derived, a contradiction, meaning that the grammar for L cannot be linear.
Suppose there were a linear grammar for L'. Then that grammar would have to generate all the strings in (a^n b^n)R(a^m b^m), plus those in {e} U R. But it can't generate the former, by the argument used above: any production useful for that purpose would get us no closer to a string in the language.

Generate context free grammar for the following language

{a^i b^j c^k d^m | i + j = k + m, i < m}
The letters must appear in order: first the a's, then the b's, then the c's, then the d's (e.g. abbccd, not cbbcda).
I know that you must "count" the number of a's and b's you are adding to make sure there are an equivalent number of c's and d's. I just can't seem to figure out how to make sure there are more c's than a's in the language. I appreciate any help anyone can give. I've been working on this for many hours now.
Edit:
The grammar should be Context Free
I have only got these two currently because all others turned out to be very wrong:
S -> C A D
| B
B -> C B D
|
C -> a
| b
D -> c
| d
and
S -> a S d
| A
A -> b A c
|
(which is close but doesn't satisfy the i < k part)
EDIT: This is for when i < k, not i < m. OP changed the problem, but I figure this answer may still be useful.
This is not a context-free language, and this can be proven with the pumping lemma, which states that if a language is context-free, there exists an integer p > 0 such that all strings in the language of length >= p can be split into substrings uvwxy, where len(vx) >= 1, len(vwx) <= p, and u v^n w x^n y is a member of the language for all n >= 0.
Suppose that a value of p exists. We can create a string such that:
k = i + 1
j = m + 1
j > p
k > p
v and x cannot each contain more than one kind of letter, nor can both lie on the left half or both on the right half, because then raising them to powers would immediately produce strings outside the language. They cannot consist of the same letter as each other, because then pumping them would break the rule that i + j = k + m. v cannot be a's if x is d's, because then w would contain all the b's and c's, which makes len(vwx) > p. By the same reasoning, v cannot be a's if x is c's, and v cannot be b's if x is d's. The only remaining option is v being b's and x being c's, but setting n to 0 then either breaks i + j = k + m (when len(v) != len(x)) or reduces k to at most i, breaking i < k.
Therefore, the language is not context-free.
There has to be at least one d because i < m, and since i + j = k + m there has to be a b somewhere to offset it. The productions for T and V guarantee this criterion; S then adds matching a's and d's.
T ::= bd | bTd
U ::= bc | bUc
V ::= bUd | bVd
S ::= T | V | aSd
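As a quick check on this grammar, here is a Python sketch (my own bounded enumerator, not part of the answer) that derives strings up to a fixed depth and verifies each one satisfies i + j = k + m and i < m:

```python
import re

RULES = {"T": ["bd", "bTd"], "U": ["bc", "bUc"],
         "V": ["bUd", "bVd"], "S": ["T", "V", "aSd"]}

def expand(sym, depth):
    """All terminal strings derivable from `sym` in at most `depth` expansions."""
    if depth == 0:
        return set()
    out = set()
    for prod in RULES[sym]:
        partials = [""]
        for ch in prod:
            if ch in RULES:
                partials = [p + s for p in partials for s in expand(ch, depth - 1)]
            else:
                partials = [p + ch for p in partials]
        out.update(partials)
    return out

def in_language(w):
    mt = re.fullmatch(r"(a*)(b*)(c*)(d*)", w)
    if mt is None:
        return False          # letters out of order
    i, j, k, m = (len(g) for g in mt.groups())
    return i + j == k + m and i < m

words = expand("S", 6)
print(len(words) > 0 and all(in_language(w) for w in words))  # True
```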

vector reflexivity under setoid equality using CoRN MathClasses

I have a simple lemma:
Lemma map2_comm: forall A (f:A->A->B) n (a b:t A n),
(forall x y, (f x y) = (f y x)) -> map2 f a b = map2 f b a.
which I was able to prove using standard equality (≡). Now I need to prove a similar lemma using setoid equality (using CoRN MathClasses). I am new to this library and to type classes in general, and I am having difficulty doing so. My first attempt is:
Lemma map2_setoid_comm `{Equiv B} `{Equiv (t B n)} `{Commutative B A}:
forall (a b: t A n),
map2 f a b = map2 f b a.
Proof.
intros.
induction n.
dep_destruct a.
dep_destruct b.
simpl.
(here '=' is 'equiv'). After 'simpl' the goal is "(nil B) = (nil B)", or "[] = []" using VectorNotations. Normally I would finish it with the 'reflexivity' tactic, but that gives me:
Tactic failure: The relation equiv is not a declared reflexive relation. Maybe you need to require the Setoid library.
I guess I need to somehow define reflexivity for vector types, but I am not sure how to do that. Please advise.
First of all, the lemma statement needs to be adjusted to:
Lemma map2_setoid_comm : forall `{CO:Commutative B A f} `{SB: !Setoid B} ,
forall n:nat, Commutative (map2 f (n:=n)).
To be able to use reflexivity, define a setoid equality on vectors:
Definition vec_equiv `{Equiv A} {n}: relation (vector A n) := Vforall2 (n:=n) equiv.
Instance vec_Equiv `{Equiv A} {n}: Equiv (vector A n) := vec_equiv.

Unrestricted grammar

What does this general grammar do?
S -> LR
L -> L0Y
L -> LX
X1 -> 1X
X0 -> 0X
X0 -> 1Y
Y1 -> 0Y
YR -> R
L -> epsilon
R -> epsilon
the start symbol is S. I tried generating strings from this grammar and I got every binary number, but I think it does something specific.
terminals: 0,1
start: S
Let's split the grammar:
S -> LR
L -> L0Y
L -> LX
This will generate a sentential form consisting of L, then a string of X's and 0Y blocks, then R.
X1 -> 1X
X0 -> 0X
X0 -> 1Y
Y1 -> 0Y
YR -> R
Treat X and Y as acting on the binary string: X will propagate to the right, then change a 0 to 1 and all subsequent 1s to 0s. In effect, a single X increments the binary number without changing its string length (or gets stuck).
A leading Y will rewrite the string of all 1s to all 0s (or gets stuck).
Treat the rules for L as the possible actions on the string to the right. L -> L0Y will reset the string from all ones to all zeroes and increase its length by one. L -> LX will increment any other number, but gets stuck if the value is already at its maximum (all ones).
These two actions together are sufficient to generate (inefficiently) all strings of zeroes and ones (including the empty string).
L -> epsilon
R -> epsilon
will only clean up the sentinels.
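This reading of X and Y can be simulated directly. A small Python sketch (helper names are mine) of the net effect of each marker on the binary string to its right:

```python
def apply_X(w):
    """Net effect of one X on binary string w, or None if X gets stuck:
    flip the rightmost 0 to 1 and every 1 after it to 0, i.e. a binary
    increment when w is read most-significant-bit first."""
    i = w.rfind("0")
    if i == -1:
        return None                      # all ones: X0 -> 1Y can never fire
    return w[:i] + "1" + "0" * (len(w) - i - 1)

def apply_0Y(w):
    """Net effect of a 0Y block in front of w: Y may only cross 1s,
    turning each into a 0, before being absorbed by YR -> R."""
    if "0" in w:
        return None                      # Y has no rule for 0, so it gets stuck
    return "0" * (len(w) + 1)

print(apply_X("011"))    # '100'  (3 -> 4)
print(apply_X("111"))    # None   (stuck at the maximum)
print(apply_0Y("111"))   # '0000' (roll over to a longer all-zero string)
```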
One possible description of the language, in four words:
set of all strings

optimization of a haskell code

I wrote the following Haskell code, which takes a triplet (x,y,z) and a list of triplets [(Int,Int,Int)] and looks for a triplet (a,b,c) in the list such that x == a and y == b. If there is one, I just need to update it with c = c + z; if there is no such triplet in the list, I just add the triplet to the list.
-- insertEdge :: (Int,Int,Int) -> [(Int, Int, Int)] -> [(Int, Int, Int)]
insertEdge (x,y,z) cs =
if (length [(a,b,c) | (a,b,c) <- cs, a /= x || b /= y]) == (length cs)
then ((x,y,z):cs)
else [if (a == x && b == y) then (a,b,c+z) else (a,b,c) | (a,b,c) <- cs]
After profiling my code, it appears that this function takes 65% of the execution time.
How can I re-write my code to be more efficient?
Other answers are correct, so I want to offer some unasked-for advice instead: how about using Data.Map (Int,Int) Int instead of a list?
Then your function becomes insertWith (+) (a,b) c mymap
The first thing that jumps out at me is the conditional: length examines the entire list, so in the worst-case scenario (updating the last element) your function traverses the list three times: Once for the length of the filtered list, once for the length of cs, and once to find the element to update.
However, even getting rid of the extra traversals, the best you can do with the function as written will usually require a traversal of most of the list. From the name of the function and how much time was being spent in it, I'm guessing you're calling this repeatedly to build up a data structure? If so, you should strongly consider using a more efficient representation.
For instance, a quick and easy improvement would be to use Data.Map, with the first two elements of the triplet in a 2-tuple as the key and the third element as the value. That way you can avoid making so many linear-time lookups/redundant traversals.
As a rule of thumb, lists in Haskell are only an appropriate data structure when all you do is either walk sequentially down the list a few times (ideally, just once) or add/remove from the head of the list (i.e., using it like a stack). If you're searching, filtering, updating elements in the middle, or--worst of all--indexing by position, using lists will only end in tears.
Here's a quick example, if that helps:
import qualified Data.Map as M
incEdge :: M.Map (Int, Int) Int -> ((Int, Int), Int) -> M.Map (Int, Int) Int
incEdge cs (k,v) = M.alter f k cs
where f (Just n) = Just $ n + v
f Nothing = Just v
The alter function is just insert/update/delete all rolled into one. This inserts the key into the map if it's not there, and sums the values if the key does exist. To build up a structure incrementally, you can do something like foldl incEdge M.empty edgeList. Testing this out, for a few thousand random edges your version with a list takes several seconds, whereas the Data.Map version is pretty much immediate.
It's always a good idea to benchmark (and Criterion makes it so easy). Here are the results for the original solution (insertEdgeO), Geoff's foldr (insertEdgeF), and Data.Map (insertEdgeM):
benchmarking insertEdgeO...
mean: 380.5062 ms, lb 379.5357 ms, ub 381.1074 ms, ci 0.950
benchmarking insertEdgeF...
mean: 74.54564 ms, lb 74.40043 ms, ub 74.71190 ms, ci 0.950
benchmarking insertEdgeM...
mean: 18.12264 ms, lb 18.03029 ms, ub 18.21342 ms, ci 0.950
Here's the code (I compiled with -O2):
module Main where
import Criterion.Main
import Data.List (foldl')
import qualified Data.Map as M
insertEdgeO :: (Int, Int, Int) -> [(Int, Int, Int)] -> [(Int, Int, Int)]
insertEdgeO (x, y, z) cs =
if length [(a, b, c) | (a, b, c) <- cs, a /= x || b /= y] == length cs
then (x, y, z) : cs
else [if (a == x && b == y) then (a, b, c + z) else (a, b, c) | (a, b, c) <- cs]
insertEdgeF :: (Int, Int, Int) -> [(Int, Int, Int)] -> [(Int, Int, Int)]
insertEdgeF (x,y,z) cs =
case foldr f (False, []) cs of
(False, cs') -> (x, y, z) : cs'
(True, cs') -> cs'
where
f (a, b, c) (e, cs')
| (a, b) == (x, y) = (True, (a, b, c + z) : cs')
| otherwise = (e, (a, b, c) : cs')
insertEdgeM :: (Int, Int, Int) -> M.Map (Int, Int) Int -> M.Map (Int, Int) Int
insertEdgeM (a, b, c) = M.insertWith (+) (a, b) c
testSet n = [(a, b, c) | a <- [1..n], b <- [1..n], c <- [1..n]]
testO = foldl' (flip insertEdgeO) [] . testSet
testF = foldl' (flip insertEdgeF) [] . testSet
testM = triplify . M.toDescList . foldl' (flip insertEdgeM) M.empty . testSet
where
triplify = map (\((a, b), c) -> (a, b, c))
main = let n = 25 in defaultMain
[ bench "insertEdgeO" $ nf testO n
, bench "insertEdgeF" $ nf testF n
, bench "insertEdgeM" $ nf testM n
]
You can improve insertEdgeF a bit by using foldl' (55.88634 ms), but Data.Map still wins.
The main reason your function is slow is that it traverses the list at least twice, maybe three times. The function can be rewritten to traverse the list only once using a fold. This transforms the list into a tuple (Bool, [(Int,Int,Int)]), where the Bool indicates whether there was a matching element in the list and the list is the transformed list.
insertEdge (x,y,z) cs = case foldr f (False,[]) cs of
(False,cs') -> (x,y,z):cs'
(True,cs') -> cs'
where f (a,b,c) (e,cs') = if (a,b) == (x,y) then (True,(a,b,c+z):cs') else (e,(a,b,c):cs')
If you haven't seen foldr before, it has type
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr embodies a pattern of recursive list processing of defining a base case and combining the current list element with the result from the rest of the list. Writing foldr f b xs is the same as writing a function g with definition
g [] = b
g (x:xs) = f x (g xs)
Sticking with your data structure, you might write:
type Edge = (Int,Int,Int)
insertEdge :: Edge -> [Edge] -> [Edge]
insertEdge t@(x,y,z) es =
case break (abx t) es of
(_, []) -> t : es
(l, ((_,_,zold):r)) -> l ++ (x,y,z+zold) : r
where abx (a1,b1,_) (a2,b2,_) = a1 == a2 && b1 == b2
No matter what language you're using, searching lists is always a red flag. When searching you want sublinear complexity (think: hashes, binary search trees, and so on). In Haskell, an implementation using Data.Map is
import Data.Map
type Edge = (Int,Int,Int)
type EdgeMap = Map (Int,Int) Int
insertEdge :: Edge -> EdgeMap -> EdgeMap
insertEdge (x,y,z) es = alter accumz (x,y) es
where accumz Nothing = Just z
accumz (Just zold) = Just (z + zold)
You may not be familiar with alter:
alter :: Ord k => (Maybe a -> Maybe a) -> k -> Map k a -> Map k a
O(log n). The expression (alter f k map) alters the value x at k, or absence thereof. alter can be used to insert, delete, or update a value in a Map. In short: lookup k (alter f k m) = f (lookup k m).
let f _ = Nothing
alter f 7 (fromList [(5,"a"), (3,"b")]) == fromList [(3, "b"), (5, "a")]
alter f 5 (fromList [(5,"a"), (3,"b")]) == singleton 3 "b"
let f _ = Just "c"
alter f 7 (fromList [(5,"a"), (3,"b")]) == fromList [(3, "b"), (5, "a"), (7, "c")]
alter f 5 (fromList [(5,"a"), (3,"b")]) == fromList [(3, "b"), (5, "c")]
But as ADEpt shows in another answer, this is a bit of overengineering.
In
insertEdgeM :: (Int, Int, Int) -> M.Map (Int, Int) Int -> M.Map (Int, Int) Int
insertEdgeM (a, b, c) = M.insertWith (+) (a, b) c
you want to use the strict version of insertWith, namely insertWith'.
Very small optimisation: use an as-pattern; this avoids multiple reconstructions of the same tuple. Like this:
insertEdge xyz@(x,y,z) cs =
if (length [abc | abc@(a,b,c) <- cs, a /= x || b /= y]) == (length cs)
then (xyz:cs)
else [if (a == x && b == y) then (a,b,c+z) else abc' | abc'@(a,b,c) <- cs]
You should apply the other optimization hints first, but this may save a very small amount of time, since the tuple doesn't have to be reconstructed again and again. At least in the last as-pattern (the first two patterns are not important, since the tuple never gets evaluated in the first case and the as-pattern is only applied once in the second case).