Minimum number of states needed? - finite-automata

A language L over the alphabet { a } is defined as follows:
L = { a^(nk) | k > 0, where n is a positive integer constant }
What is the number of states needed in a DFA to recognize L?
In my opinion it should be k+1 but I am not sure.

The language L can be recognized by a DFA with n+1 states.
Observe that the length of any string in L is congruent to 0 mod n.
Label n of the states with integers 0, 1, 2, ... n-1, representing each possible remainder. An additional state, S, is the start state. S has a single transition, to state 1 mod n (this is state 1, except when n = 1, where it is state 0). If the machine is currently in state i, on input a it moves to state (i+1) mod n. State 0 is the only accepting state. (If the empty string were part of L, we could eliminate S and make state 0 the start state).
Suppose there were a DFA with fewer than n+1 states that still recognized L. Consider the sequence of states S0, S1, ..., Sn encountered while processing the string a^n. Sn must be an accepting state, since a^n is in L. But since there are fewer than n+1 distinct states in this DFA, by the pigeonhole principle there must have been some state that was visited at least twice. Removing that loop gives another path (and another accepted string), with length < n, from S0 to Sn. But L contains no strings shorter than n, contradicting our assumption. Therefore no DFA with fewer than n+1 states recognizes L.
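The construction above can be sketched in Python. This is only an illustration: the function names build_dfa and accepts, and the choice of the string "S" as the label of the start state, are assumptions made for this example.

```python
def build_dfa(n):
    """Build the (n+1)-state DFA for L = { a^(nk) | k > 0 }.

    Returns (start, accepting, delta), where delta maps each state to
    its successor on the single input symbol 'a'.
    """
    start = "S"
    accepting = {0}
    # S moves to remainder 1 mod n; remainder i moves to (i + 1) mod n.
    delta = {"S": 1 % n}
    for i in range(n):
        delta[i] = (i + 1) % n
    return start, accepting, delta


def accepts(n, word):
    """Run the DFA for the given n on word (a string over {a})."""
    start, accepting, delta = build_dfa(n)
    state = start
    for symbol in word:
        if symbol != "a":
            return False  # symbol outside the alphabet
        state = delta[state]
    return state in accepting


n = 3
accepted_lengths = [k for k in range(10) if accepts(n, "a" * k)]
print(accepted_lengths)  # [3, 6, 9]
```

The empty string is rejected because the run ends in S, which is not accepting; that is the only reason the extra state is needed.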

Related

Moore Machines with n states

I am working on an end of chapter question regarding how many different Moore machines there are with n states. Where n = number of states, m = number of input letters, and q = number of output characters, is it correct that there are n*q^m possible machines? My reasoning is that for each state, each input has the possibility to lead to one of the given output characters.
A Moore machine consists of:
set of states S (n)
start state s0
input alphabet Sigma (m)
output alphabet A (q)
transition function (S x Sigma -> S)
output function (S -> A)
The number of states and input/output-characters is given.
For the start state there are n possibilities.
For the transition function, there are |S| ^ (|S| * |Sigma|) = n^(n*m) different variants.
Finally, there are |A| ^ |S| = q^n output functions
This yields in total n^(n*m+1) * q^n different Moore machines.
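As a sanity check of this count, here is a small brute-force sketch for tiny values of n, m and q. The function names count_moore_machines and formula are assumptions made for this example; machines are counted as distinct tuples (start state, transition table, output table), exactly as in the reasoning above.

```python
from itertools import product


def count_moore_machines(n, m, q):
    """Enumerate every (start state, transition table, output table)."""
    states = range(n)
    count = 0
    for _start in states:
        # One target state for each (state, input letter) pair.
        for _transition in product(states, repeat=n * m):
            # One output character for each state.
            for _output in product(range(q), repeat=n):
                count += 1
    return count


def formula(n, m, q):
    """Closed form from the answer: n^(n*m + 1) * q^n."""
    return n ** (n * m + 1) * q ** n


print(count_moore_machines(2, 2, 2), formula(2, 2, 2))  # 128 128
```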

How can I prove this language is regular?

I'm trying to prove if this language:
L = { w ∈ {0,1}* | #0(w) % 3 = 0 } (number of 0's is divisible by 3)
is regular using the pumping lemma, but I can't find a way to do it. All the other examples I have seen have a simpler, more clearly defined form, such as w = axbycz etc.
I don't think you can use pumping lemma to prove that a language is regular. To prove a language is regular, you just need to give a regular expression or a DFA. In this case the regular expression is quite easy:
1*(01*01*01*)*
(proof: the regular expression clearly does not accept any string whose number of 0's is not divisible by 3, so we just need to prove that every string whose number of 0's is divisible by 3 is accepted by this regular expression. A string that contains 3n 0's can be written as 1^(n_0) 0 1^(n_1) 0 1^(n_2) 0 1^(n_3) ... 0 1^(n_(3n-2)) 0 1^(n_(3n-1)) 0 1^(n_(3n)), where each n_k >= 0 is chosen to match the corresponding run of 1's in the string, and this format is clearly accepted by the regular expression)
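The claim about the regular expression can also be checked mechanically on short strings. The following sketch compares Python's re.fullmatch against a direct count of 0's for every binary string of length at most 8; the names PATTERN, zeros_divisible_by_three and mismatches are assumptions made for this example, and a finite check like this supports the proof but does not replace it.

```python
import re
from itertools import product

# The regular expression from the answer, in Python syntax.
PATTERN = re.compile(r"1*(01*01*01*)*")


def zeros_divisible_by_three(w):
    """Direct definition of the language."""
    return w.count("0") % 3 == 0


# Collect every string on which the regex and the definition disagree.
mismatches = [
    w
    for length in range(9)
    for w in map("".join, product("01", repeat=length))
    if bool(PATTERN.fullmatch(w)) != zeros_divisible_by_three(w)
]
print(mismatches)  # []
```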
Pumping lemma cannot be used to prove that a language is regular because we cannot set the y as in Daniel Martin's answer. Here is a counter-example, in a similar format as his answer (please correct me if I'm doing something fundamentally different from his answer):
We prove that the language L = {w = 0^n 1^p | n ∈ N, n>0, p is prime} is regular using pumping lemma as follows: note that there is at least one occurrence of 0, so we take y as 0, and we have x y^k z = 0^(n+k-1) 1^p, which still satisfies the language definition. Therefore L is regular.
But this is false, since we know that a language defined by prime-numbered length, such as {1^p | p is prime}, is not regular. The problem here is that we cannot just set y to any character.
Any string in this language with at least three characters in it has this property: either the string has a "1" in it, or there are three "0"s in a row.
If the string contains a 1, then you can split it as in the pumping lemma and set y equal to some 1 in the string. Then obviously the strings xyz, xyyz, xyyyz, etc. are all in the language because all those strings have the same number of zeros.
If the string does not contain a 1, it contains three 0s in a row. Setting y to those three 0s, it should be obvious that xyz, xyyz, xyyyz, etc. are all in the language because you're adding three 0 characters each time, so you always have a number of 0s divisible by 3.
@justhalf in the comments is perfectly correct; the pumping lemma can be used to prove that a regular language can be pumped or that a language that cannot be pumped is not regular, but you cannot use the pumping lemma to prove that a language is regular in the first place. Mea Culpa.
Instead, here's a proof that the given language is regular based on the Myhill-Nerode Theorem:
Consider the set of all strings of 0s and 1s. Divide these strings into three sets:
E0, all strings such that the number of 0s is a multiple of three,
E1, all strings such that the number of 0s is one more than a multiple of three,
E2, all strings such that the number of 0s is two more than a multiple of three.
Obviously, every string of 0s and 1s is in one of these three sets.
Furthermore, if x and z are both strings of 0s and 1s, then consider what it means if the concatenation xz is in L:
If x is in E0, then xz is in L if and only if z is in E0
If x is in E1, then xz is in L if and only if z is in E2
If x is in E2, then xz is in L if and only if z is in E1
Therefore, in the language of the theorem, there is no distinguishing extension for any two strings in the same one of our three Ei sets, and therefore there are at most three equivalence classes. A finite number of equivalence classes means the language is regular.
(in fact, there are exactly three equivalence classes, but that isn't needed)
A language is regular if and only if some nondeterministic finite automaton recognizes it.
An automaton is a finite state machine.
We have to build an automaton that recognizes L.
For each state, thinking like:
"Where am I?"
"Where can I go to, with some given entry?"
So, for L = { w ∈ {0,1}* | #0(w) % 3 = 0 }
The possibilities (states) are:
The remainder of the division is 0, 1 or 2, which means we need three states.
Let q0, q1 and q2 be the states that represent the remainders 0, 1 and 2, respectively.
q0 is both the start state and the only final state.
Now, for "0" entries, compute #0(w) % 3 and go to the appropriate state.
Transition function:
f(q0, 0) = q1
f(q1, 0) = q2
f(q2, 0) = q0
For "1" entries, the machine just loops in its current state, because a 1 doesn't change the number of 0s.
f(qx, 1) = qx
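The automaton described above can be written down directly as a transition table. This is a minimal sketch; the names DELTA and accepts are assumptions made for this example, and the states q0, q1, q2 are represented by their names as strings.

```python
# Transition function f from the answer: a 0 advances the remainder,
# a 1 leaves the state unchanged.
DELTA = {
    ("q0", "0"): "q1",
    ("q1", "0"): "q2",
    ("q2", "0"): "q0",
    ("q0", "1"): "q0",
    ("q1", "1"): "q1",
    ("q2", "1"): "q2",
}


def accepts(w):
    """Return True when the number of 0s in w is divisible by 3."""
    state = "q0"  # q0 is the start state
    for symbol in w:
        state = DELTA[(state, symbol)]
    return state == "q0"  # q0 is also the only final state


print(accepts("0101011"))  # True: three 0s
print(accepts("10101"))    # False: two 0s
```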
The pumping lemma is used to prove that a language is not regular.
Here is a good book on the theory of computation: Introduction to the Theory of Computation, 3rd Edition, by Michael Sipser.

Minimum number of states in a DFA having '1' as the 5th symbol from right

What is the minimum number of states needed in a DFA to accept the strings having '1' as 5th symbol from right? Strings are defined over the alphabet {0,1}.
The Myhill-Nerode theorem is a useful tool for solving these sorts of problems.
The idea is to build up a set of equivalence classes of strings, using the idea of "distinguishing extensions". Consider two strings x and y. If there exists a string z
such that exactly one of xz and yz is in the language, then z is a distinguishing extension,
and x and y must belong to different equivalence classes. Each equivalence class maps to a different state in the minimal DFA.
For the language you've described, let x and y be any pair of different 5-character strings over {0,1}. If they differ at position n (counting from the right, starting at 1), then any string z with length 5-n will be a distinguishing extension: if x has a 0 at position n, and y has a 1 at position n, then xz is rejected and yz is accepted. This gives 2^5 = 32 equivalence classes.
If s is a string with length k < 5 characters, it belongs to the same equivalence class as 0^(5-k)s (i.e. add 0-padding to the left until it's 5 characters long).
If s is a string with length k > 5 characters, its equivalence class is determined by its final 5 characters.
Therefore, all strings over {0,1} fall into one of the 32 equivalence classes described above, and by the Myhill-Nerode theorem, the minimal DFA for this language has 32 states.
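The distinguishing-extension argument can be checked exhaustively for the 32 strings of length 5. In this sketch the names in_language, distinguishing_extension, words and all_distinguished are assumptions made for this example; the extension is chosen exactly as described above, as 5-n zeros where n is a position (from the right) at which the two strings differ.

```python
from itertools import combinations, product


def in_language(w):
    """True when the 5th symbol from the right of w is '1'."""
    return len(w) >= 5 and w[-5] == "1"


def distinguishing_extension(x, y):
    """Return z of length 5-n, where x and y differ at position n."""
    for n in range(1, 6):  # positions counted from the right, from 1
        if x[-n] != y[-n]:
            return "0" * (5 - n)
    raise ValueError("x and y are identical")


words = ["".join(bits) for bits in product("01", repeat=5)]

# Every pair of distinct 5-character strings must be separated by z.
all_distinguished = all(
    in_language(x + distinguishing_extension(x, y))
    != in_language(y + distinguishing_extension(x, y))
    for x, y in combinations(words, 2)
)
print(len(words), all_distinguished)  # 32 True
```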
The number of states is 2^n, where the constrained symbol is the nth symbol from the right.
So 2^5 = 32 states are needed.

Is it possible to prove that L is a regular language?

Let L = {a^f(m) | m >= 1 } where f: Z^+ -> Z^+ is monotone increasing and satisfies the following: for every n in Z^+ there is an m in Z^+ such that f(m+1) - f(m) >= n.
Is it possible to prove that L is a regular language?
Let f(x) = 2^x. For any positive n, f(n+1) - f(n) >= n.
L = {a^f(m)} is not regular. Consider the strings a^(2^x + 1) for x >= 1. After an FA processes such a string, the shortest string which leads to an accepting state is a^(2^x - 1), having length 2^x - 1. This length is different for every x, so a separate state will be needed for every value of x. Since there are infinitely many values of x (positive integers), no FA exists to recognize L; ergo, L is not a regular language.
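The key step, that the shortest accepting extension of a^(2^x + 1) has length 2^x - 1, can be checked numerically for small x. This is a sketch for f(m) = 2^m only; the names in_language, shortest_accepting_extension and gaps are assumptions made for this example, and strings are represented by their lengths since the alphabet has a single letter.

```python
def in_language(length):
    """a^length is in L when length = 2^m for some m >= 1."""
    return length >= 2 and length & (length - 1) == 0


def shortest_accepting_extension(length):
    """Smallest number of extra a's needed to reach a string in L."""
    extra = 0
    while not in_language(length + extra):
        extra += 1
    return extra


gaps = [shortest_accepting_extension(2**x + 1) for x in range(1, 8)]
print(gaps)  # [1, 3, 7, 15, 31, 63, 127]
```

Because these values are pairwise different, no two of the strings a^(2^x + 1) can lead to the same state.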

Closure properties of context free languages

I have the following problem:
Languages L1 = {a^n * b^n : n>=0} and L2 = {b^n * a^n : n>=0} are context-free languages, and context-free languages are closed under the concatenation L1L2, so L = {a^n * b^2n a^n : n>=0} must be context-free too because it is generated by a closure property.
I have to prove if this is true or not.
So I checked the language L and I don’t think that it is context-free. I also noticed that L2 is just L1 reversed.
Do I have to check if L1, L2 are deterministic?
L1 = {a^n b^n : n>=0} and L2 = {b^n a^n : n>=0} are both context-free.
Since context-free languages are closed under concatenation, L3=L1L2 is also context-free.
However, L3 is not the same language as L4 = {a^n b^(2n) a^n : n >= 0}.
The string abbbaa is in L3, but not L4.
So is L4 context-free? If so, it must obey the pumping lemma for context-free languages.
Let p be the pumping length of L4. Choose s = a^p b^(2p) a^p. Then s is in L4, and |s| > p. Therefore, if L4 is context-free, we can write s as uvxyz, with |vxy| <= p, |vy| >= 1, and u v^n x y^n z is in L4 for any n >= 0.
Observe the following properties of any nonempty string in L4: The count of a's equals the count of b's. There is exactly one occurrence of the substring 'ab', and exactly one
occurrence of the substring 'ba'. The length of the initial string of a's equals the length of the final string of a's.
We can use these observations to constrain the possible choices of v and y in our pumping argument for L4. Neither v nor y can contain the substring 'ab', because then, as v and y are pumped an arbitrary number of times, the output string would contain multiple occurrences of 'ab', and therefore cannot be an element of L4. The same argument applies to the substring 'ba'.
So v must be either empty, all a's, or all b's. The same applies to y.
Furthermore, if v is all a's, then y must consist of the same number of b's; otherwise, the pumped string would contain unequal numbers of a's and b's since v and y are pumped by
the same n. Likewise, if v is all b's, then y must be the same number of a's.
But if v is all a's, and y is all b's, the final string of a's is unaffected by pumping v and y, therefore the leading string of a's will no longer match the trailing string of a's.
Similarly, if v is all b's and y is all a's, the leading and trailing strings of a's will again have different lengths as v and y are pumped.
v and y cannot both be empty, since that would violate the condition |vy| >= 1 for
the CFL pumping lemma. But since we have established that |v| = |y|, it follows
that neither v nor y can be empty.
But if v cannot be empty, cannot be all a's, cannot be all b's, and cannot contain
the substrings 'ab' or 'ba', then there is no possible choice of uvxyz for which
the pumped version of s is still in L4. Therefore L4 is not context-free.
I'm not sure that it is -- note that in each of the definitions of L1 and L2, n is scoped within that definition, i.e. they are two different variables. When you combine them you should rename one, and get instead:
L = {a^n * b^n b^m * a^m : n,m>=0}
This is a very different language from your L, but it is obviously a context free one.
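The difference between the concatenation L1L2 = {a^n b^n b^m a^m : n,m>=0} and the language {a^n b^2n a^n : n>=0} can be illustrated with two small membership tests. This is only a sketch; the function names in_concatenation and in_original, and the helper split_blocks, are assumptions made for this example.

```python
import re


def split_blocks(w):
    """Split w as a^i b^j a^k; return (i, j, k), or None if impossible."""
    match = re.fullmatch(r"(a*)(b*)(a*)", w)
    if match is None:
        return None
    return tuple(len(group) for group in match.groups())


def in_concatenation(w):
    """Membership in L1L2 = { a^n b^n b^m a^m : n, m >= 0 }."""
    blocks = split_blocks(w)
    if blocks is None:
        return False
    n, b_count, m = blocks
    # The b's must account for both halves: n from L1 and m from L2.
    return b_count == n + m


def in_original(w):
    """Membership in { a^n b^(2n) a^n : n >= 0 }."""
    blocks = split_blocks(w)
    if blocks is None:
        return False
    n, b_count, m = blocks
    return n == m and b_count == 2 * n


print(in_concatenation("abbbaa"), in_original("abbbaa"))  # True False
print(in_concatenation("abba"), in_original("abba"))      # True True
```

Every string accepted by in_original is also accepted by in_concatenation (take m = n), but not the other way around, which is why closure under concatenation says nothing about the smaller language.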