Minimum number of states in a DFA having '1' as the 5th symbol from right - finite-automata

What is the minimum number of states needed in a DFA to accept the strings having '1' as 5th symbol from right? Strings are defined over the alphabet {0,1}.

The Myhill-Nerode theorem is a useful tool for solving these sorts of problems.
The idea is to build up a set of equivalence classes of strings, using the idea of "distinguishing extensions". Consider two strings x and y. If there exists a string z
such that exactly one of xz and yz is in the language, then z is a distinguishing extension,
and x and y must belong to different equivalence classes. Each equivalence class maps to a different state in the minimal DFA.
For the language you've described, let x and y be any pair of different 5-character strings
over {0,1}. If they differ at position n (counting from the right, starting at 1), then any string z with length 5-n will be a distinguishing extension: if x has a 0 at position n,
and y has a 1 at position n, then xz is rejected and yz is accepted. This gives 2^5 = 32
equivalence classes.
If s is a string with length k < 5 characters, it belongs to the same equivalence class
as 0^(5-k)s (i.e. add 0-padding to the left until it's 5 characters long).
If s is a string with length k > 5 characters, its equivalence class is determined by its final 5 characters.
Therefore, all strings over {0,1} fall into one of the 32 equivalence classes described above, and by the Myhill-Nerode theorem, the minimal DFA for this language has 32 states.
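The distinguishing-extension argument can be checked by brute force. The following is a minimal Python sketch (the function names are made up for illustration): it records, for every 5-character string, which short extensions lead to acceptance, and counts how many different acceptance patterns appear.

```python
from itertools import product

def accepts_kth_from_right(s: str, k: int = 5) -> bool:
    """True when the k-th symbol from the right of s is '1'."""
    return len(s) >= k and s[-k] == "1"

def count_distinguishable(k: int = 5) -> int:
    """Count length-k binary strings that are pairwise distinguishable."""
    words = ["".join(bits) for bits in product("01", repeat=k)]
    # Extensions of length 0..k-1 are enough: a difference at position n
    # (from the right) is exposed by any extension of length k - n.
    extensions = ["".join(bits)
                  for m in range(k)
                  for bits in product("01", repeat=m)]
    signatures = {
        tuple(accepts_kth_from_right(w + z, k) for z in extensions)
        for w in words
    }
    return len(signatures)

print(count_distinguishable(5))  # 32
```

Each distinct signature corresponds to a different equivalence class, so the count is a lower bound on the number of states; the padding and suffix arguments above show that no further classes exist.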

In general, the number of states is 2^n when the required symbol is the n-th from the right.
So here the minimal DFA has 2^5 = 32 states.

Related

Unnormalizing in Knuth's Algorithm D

I'm trying to implement Algorithm D from Knuth's "The Art of Computer Programming, Vol 2" in Rust, but I'm having trouble understanding how to implement the very last step, unnormalizing. My natural numbers are a class where each number is a vector of u64, in base u64::MAX. Addition, subtraction, and multiplication have been implemented.
Knuth's Algorithm D is a Euclidean division algorithm which takes two natural numbers x and y and returns (q,r) where q = x / y (integer division) and r = x % y, the remainder. The algorithm depends on an approximation method which only works if the first digit of y is greater than b/2, where b is the base you're representing the numbers in. Since not all numbers are of this form, it uses a "normalizing trick": for example (if we were in base 10), instead of doing 200 / 23, we calculate a normalizer d and do (200 * d) / (23 * d) so that 23 * d has a first digit greater than b/2.
So when we use the approximation method, we end up with the desired q, but the remainder is multiplied by a factor of d. So the last step is to divide r by d so that we get the q and r we want. My problem is that I'm a bit confused about how we're supposed to do this last step, as it requires division and the method it's part of is itself trying to implement division.
(Maybe helpful?):
Knuth calculates d as the integer floor of b divided by (the first digit of y, plus 1). However, he suggests that it's also possible to make d a power of 2, as long as d times the first digit of y is at least b / 2. I think he makes this suggestion so that instead of dividing, we can just do a binary shift for this last step. However, I don't think I can do that, given that my numbers are represented as vectors of u64 values instead of as a single binary value.
Any suggestions?
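One possible approach, assuming d is chosen as a power of two (d = 2^s): dividing the remainder by d is then a right shift carried across the digits, which needs no general division. The sketch below is in Python rather than Rust and makes several assumptions that are not in the question: digits ("limbs") are 64 bits wide, so the base is 2^64, they are stored least significant first, and all function names are hypothetical.

```python
LIMB_BITS = 64
MASK = (1 << LIMB_BITS) - 1  # emulates u64 wrap-around in Python

def normalization_shift(top_limb: int) -> int:
    """Shift count s such that (top_limb << s) has its highest bit set."""
    assert 0 < top_limb <= MASK
    return LIMB_BITS - top_limb.bit_length()

def shift_left(limbs: list[int], s: int) -> list[int]:
    """Multiply a little-endian limb vector by 2**s, for 0 <= s < LIMB_BITS."""
    out, carry = [], 0
    for limb in limbs:
        out.append(((limb << s) | carry) & MASK)
        # Bits pushed out of this limb move into the next one.
        carry = limb >> (LIMB_BITS - s) if s else 0
    out.append(carry)  # extra top limb, possibly zero
    return out

def shift_right(limbs: list[int], s: int) -> list[int]:
    """Divide a little-endian limb vector by 2**s, for 0 <= s < LIMB_BITS."""
    out, carry = [0] * len(limbs), 0
    for i in reversed(range(len(limbs))):
        limb = limbs[i]
        out[i] = (limb >> s) | carry
        # The low s bits of this limb become the high bits of the next lower limb.
        carry = (limb << (LIMB_BITS - s)) & MASK if s else 0
    return out
```

With this choice, normalizing is `shift_left(x, s)` and `shift_left(y, s)`, and unnormalizing the remainder is `shift_right(r, s)`. If d is not a power of two, dividing r by the single digit d is still only a short division (one pass over the digits with a running remainder), which is much simpler than Algorithm D itself.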

How can I prove this language is regular?

I'm trying to prove that this language:
L = { w ∈ {0,1}* | #0(w) % 3 = 0 } (the number of 0's is divisible by 3)
is regular using the pumping lemma, but I can't find a way to do it. All the other examples I have seen have a simple, or let's say more defined, form such as w = a^x b^y c^z etc.
I don't think you can use the pumping lemma to prove that a language is regular. To prove a language is regular, you just need to give a regular expression or a DFA. In this case the regular expression is quite easy:
1*(01*01*01*)*
(Proof: the regular expression clearly does not accept any string whose number of 0's is not divisible by 3, so we just need to show that every string whose number of 0's is divisible by 3 is accepted. A string containing 3n 0's has the form 1^(n_0) 0 1^(n_1) 0 1^(n_2) 0 1^(n_3) ... 0 1^(n_(3n-2)) 0 1^(n_(3n-1)) 0 1^(n_(3n)), where each exponent n_k ≥ 0 is the length of that run of 1's, and this form is clearly accepted by the regular expression.)
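As a sanity check (not a proof), the regular expression can be compared against the definition of L on all short strings. This is a small Python sketch using the standard `re` module; the helper names are made up for illustration.

```python
import re
from itertools import product

PATTERN = re.compile(r"1*(01*01*01*)*")

def in_language(w: str) -> bool:
    """Definition of L: the number of 0's is divisible by 3."""
    return w.count("0") % 3 == 0

def regex_agrees_with_definition(max_len: int) -> bool:
    """Compare the regex with the definition on every string up to max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            w = "".join(bits)
            if (PATTERN.fullmatch(w) is not None) != in_language(w):
                return False
    return True

print(regex_agrees_with_definition(10))  # True
```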
The pumping lemma cannot be used to prove that a language is regular because we cannot choose y freely, as is done in Daniel Martin's answer. Here is a counter-example, in a similar format to his answer (please correct me if I'm doing something fundamentally different from his answer):
We "prove" that the language L = { w = 0^n 1^p | n ∈ N, n > 0, p is prime } is regular using the pumping lemma as follows: note that there is at least one occurrence of 0, so we take y to be 0, and we have x y^k z = 0^(n+k-1) 1^p, which still satisfies the language definition. Therefore L is regular.
But this is false, since the language of strings of prime length is known not to be regular. The problem here is that we cannot just set y to any character we like.
Any string in this language with at least three characters in it has this property: either the string has a "1" in it, or there are three "0"s in a row.
If the string contains a 1, then you can split it as in the pumping lemma and set y equal to some 1 in the string. Then obviously the strings xyz, xyyz, xyyyz, etc. are all in the language because all those strings have the same number of zeros.
If the string does not contain a 1, it contains three 0s in a row. Setting y to those three 0s, it should be obvious that xyz, xyyz, xyyyz, etc. are all in the language because you're adding three 0 characters each time, so you always have a number of 0s divisible by 3.
@justhalf in the comments is perfectly correct; the pumping lemma can be used to prove that a regular language can be pumped, or that a language that cannot be pumped is not regular, but you cannot use the pumping lemma to prove that a language is regular in the first place. Mea culpa.
Instead, here's a proof that the given language is regular based on the Myhill-Nerode Theorem:
Consider the set of all strings of 0s and 1s. Divide these strings into three sets:
E0, all strings such that the number of 0s is a multiple of three,
E1, all strings such that the number of 0s is one more than a multiple of three,
E2, all strings such that the number of 0s is two more than a multiple of three.
Obviously, every string of 0s and 1s is in one of these three sets.
Furthermore, if x and z are both strings of 0s and 1s, then consider what it means if the concatenation xz is in L:
If x is in E0, then xz is in L if and only if z is in E0
If x is in E1, then xz is in L if and only if z is in E2
If x is in E2, then xz is in L if and only if z is in E1
Therefore, in the language of the theorem, there is no distinguishing extension for any two strings in the same one of our three Ei sets, and therefore there are at most three equivalence classes. A finite number of equivalence classes means the language is regular.
(in fact, there are exactly three equivalence classes, but that isn't needed)
A language is regular if and only if some nondeterministic finite automaton recognizes it.
An automaton here is a finite state machine.
So we have to build an automaton that recognizes L.
For each state, thinking like:
"Where am I?"
"Where can I go to, with some given entry?"
So, for L = { w ∈ {0,1}* | #0(w) % 3 = 0 }
The possibilities (states) are:
The remainder of the division is 0, 1 or 2, which means we need three states.
Let q0, q1 and q2 be the states that represent the remainders 0, 1 and 2, respectively.
q0 is both the start state and the only final state.
Now, for each "0" read, update #0(w) % 3 and go to the appropriate state.
Transition function:
f(q0, 0) = q1
f(q1, 0) = q2
f(q2, 0) = q0
For "1" entries, the machine just loops in whatever state it is in, because a 1 doesn't change the number of 0's.
f(qx, 1) = qx
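The transition function above can be simulated directly. Here is a minimal Python sketch in which the states q0, q1, q2 are represented by the integers 0, 1, 2 (an encoding chosen only for this illustration):

```python
def run_mod3_dfa(w: str) -> bool:
    """Simulate the three-state DFA; accept when it ends in q0."""
    state = 0  # q0 is the start state
    for ch in w:
        if ch == "0":
            state = (state + 1) % 3  # f(q0,0)=q1, f(q1,0)=q2, f(q2,0)=q0
        elif ch == "1":
            pass  # f(qx,1)=qx: reading a 1 keeps the current state
        else:
            raise ValueError(f"symbol {ch!r} is not in the alphabet {{0,1}}")
    return state == 0  # q0 is the only final state

print(run_mod3_dfa("0110100"))  # True: four... see note below
```

Note on the example call: "0110100" contains the zeros at positions 1, 4, 6 and 7, i.e. four zeros, so the call actually prints False; a string such as "011010" (three zeros) prints True.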
The pumping lemma can only be used to prove that a language is not regular.
Here is a good book for theory of computation: Introduction to the Theory of Computation 3rd Edition
by Michael Sipser.

Karatsuba and Toom-3 algorithms for 3-digit number multiplications

I was wondering about this problem concerning Karatsuba's algorithm.
When you apply Karatsuba you basically have to do 3 multiplications per run of the loop.
Those are (let's say ab and cd are 2-digit numbers with digits a, b and c, d respectively):
X = bd
Y = ac
Z = (a+b)(c+d)
and then the sums we were looking for are:
bd = X
ac = Y
(bc + ad) = Z - X - Y
My question is: let's say we have two 3-digit numbers, abc and def. I found out that we only have to perform 5 multiplications to multiply them. I also found the Toom-3 algorithm, but it uses polynomials, which I can't quite get. Could someone write down those multiplications and how to calculate the interesting sums bd + ae, ce + bf, cd + be + af?
The basic idea is this: The number 237 is the polynomial p(x) = 2x^2 + 3x + 7 evaluated at the point x=10. So, we can think of each integer as corresponding to a polynomial whose coefficients are the digits of the number. When we evaluate the polynomial at x=10, we get our number back.
What is interesting is that to fully specify a polynomial of degree 2, we need its value at just 3 distinct points. We need 5 values to fully specify a polynomial of degree 4.
So, if we want to multiply two 3 digit numbers, we can do so by:
1. Evaluating the corresponding polynomials at 5 distinct points.
2. Multiplying the 5 values. We now have 5 function values of the polynomial of the product.
3. Finding the coefficients of this polynomial from the five values we computed in step 2.
Karatsuba multiplication works the same way, except that we only need 3 distinct points. Instead of at 10, we evaluate the polynomial at 0, 1, and "infinity", which gives us b,a+b,a and d,d+c,c which multiplied together give you your X,Z,Y.
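For the 2-digit case this can be written out in a few lines. The following Python sketch (the function name is hypothetical) uses the three products X = bd, Y = ac and Z = (a+b)(c+d), and recovers the middle coefficient bc + ad as Z - X - Y:

```python
def karatsuba_2digit(ab: int, cd: int, base: int = 10) -> int:
    """Multiply two 2-digit numbers using only three digit products."""
    a, b = divmod(ab, base)  # ab = a*base + b
    c, d = divmod(cd, base)  # cd = c*base + d
    x = b * d                # value of the product polynomial at 0
    y = a * c                # value "at infinity" (leading coefficient)
    z = (a + b) * (c + d)    # value at 1
    middle = z - x - y       # equals b*c + a*d
    return y * base**2 + middle * base + x

print(karatsuba_2digit(23, 45))  # 1035
```

The additions and the final recombination are cheap compared with the multiplications, which is where the saving over the four schoolbook products comes from.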
Now, to write this all out in terms of abc and def is quite involved. In the Wikipedia article, it's actually done quite nicely:
In the Evaluation section, the polynomials are evaluated to give, for example, c, a+b+c, a-b+c, 4a+2b+c, a for the first number (these are its values at 0, 1, -1, 2 and "infinity").
In Pointwise products, the corresponding values for each number are multiplied, which gives:
X = cf
Y = (a+b+c)(d+e+f)
Z = (a-b+c)(d-e+f)
U = (4a+2b+c)(4d+2e+f)
V = ad
In the Interpolation section, these values are combined to give you the digits in the product. This involves solving a 5x5 system of linear equations, so again it's a bit more complicated than the Karatsuba case.
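For these particular evaluation points (0, 1, -1, 2 and "infinity") the 5x5 system can be solved by hand. The Python sketch below is one way to write it out; the names r4..r0 for the coefficients of the product polynomial and the function names are assumptions made for this illustration. The five products are X = cf, Y = (a+b+c)(d+e+f), Z = (a-b+c)(d-e+f), U = (4a+2b+c)(4d+2e+f) and V = ad.

```python
def toom3_coefficients(a, b, c, d, e, f):
    """Coefficients (r4, r3, r2, r1, r0) of (a x^2 + b x + c)(d x^2 + e x + f),
    obtained from five multiplications."""
    x = c * f                                  # r(0)  = r0
    y = (a + b + c) * (d + e + f)              # r(1)  = r4 + r3 + r2 + r1 + r0
    z = (a - b + c) * (d - e + f)              # r(-1) = r4 - r3 + r2 - r1 + r0
    u = (4*a + 2*b + c) * (4*d + 2*e + f)      # r(2)  = 16 r4 + 8 r3 + 4 r2 + 2 r1 + r0
    v = a * d                                  # r(inf) = r4
    r0, r4 = x, v
    r2 = (y + z) // 2 - r4 - r0                # (y + z) / 2 = r4 + r2 + r0
    s = (y - z) // 2                           # s = r3 + r1
    t = u - 16 * r4 - 4 * r2 - r0              # t = 8 r3 + 2 r1
    r3 = (t - 2 * s) // 6                      # t - 2 s = 6 r3
    r1 = s - r3
    return r4, r3, r2, r1, r0

def toom3_3digit(abc: int, def_: int, base: int = 10) -> int:
    """Multiply two 3-digit numbers via the five products above."""
    a, rest = divmod(abc, base**2)
    b, c = divmod(rest, base)
    d, rest = divmod(def_, base**2)
    e, f = divmod(rest, base)
    r4, r3, r2, r1, r0 = toom3_coefficients(a, b, c, d, e, f)
    return r4 * base**4 + r3 * base**3 + r2 * base**2 + r1 * base + r0

print(toom3_coefficients(1, 2, 3, 4, 5, 6))  # (4, 13, 28, 27, 18)
print(toom3_3digit(123, 456))                # 56088
```

Here r3 = ae + bd, r2 = af + be + cd and r1 = bf + ce are exactly the sums asked about in the question; all the integer divisions are exact.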

Number of possible binary strings of length k

One of my friends was asked this question recently:
You have to count how many binary strings are possible of length "K".
Constraint: Every 0 has a 1 in its immediate left.
This question can be reworded:
How many binary sequences of length K are possible if there are no two consecutive 0s and the first element is 1 (otherwise the constraint fails)? Let us forget about the first element (we can do that because it is always fixed).
Then we get a very famous problem: "What is the number of binary sequences of length K-1 that have no consecutive 0's?" The explanation can be found, for example, here
Then the answer is F(K+1), where F(K) is the K-th Fibonacci number, with the sequence starting 1, 1, 2, ...
The answer is the sum over n = 0 to ⌊K/2⌋ of C(K-n, n), where n is the number of zeros in the string.
The idea is to group every 0 with the 1 to its immediate left and count the arrangements of the resulting elements: for n zeros there are n ones grouped with them, so the string becomes K-n elements long, n of which are [10] groups. There can be no more than K/2 zeros, as otherwise there would not be enough ones to stand to the immediate left of each zero.
E.g. 111111[10][10]1[10] for K = 13, n = 3
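Both closed forms can be checked against a brute-force count for small K. This is a short Python sketch; the function names are made up, and fib uses the indexing F(1) = F(2) = 1.

```python
from itertools import product
from math import comb

def brute_force(k: int) -> int:
    """Count length-k binary strings where every 0 has a 1 immediately to its left."""
    count = 0
    for bits in product("01", repeat=k):
        s = "".join(bits)
        if all(ch == "1" or (i > 0 and s[i - 1] == "1") for i, ch in enumerate(s)):
            count += 1
    return count

def fib(n: int) -> int:
    """n-th Fibonacci number with F(1) = F(2) = 1."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a

def binomial_sum(k: int) -> int:
    """Sum of C(k - n, n) for n = 0 .. floor(k / 2)."""
    return sum(comb(k - n, n) for n in range(k // 2 + 1))

for k in range(1, 11):
    assert brute_force(k) == fib(k + 1) == binomial_sum(k)
print(brute_force(4), fib(5), binomial_sum(4))  # 5 5 5
```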

Minimum number of states needed?

Definition of a language L with alphabet { a } is given as following
L = { a^(nk) | k > 0 ; and n is a positive integer constant }
What is the number of states needed in a DFA to recognize L?
In my opinion it should be k+1 but I am not sure.
The language L can be recognized by a DFA with n+1 states.
Observe that the length of any string in L is congruent to 0 mod n.
Label n of the states with the integers 0, 1, 2, ... n-1, representing each possible remainder. An additional state, S, is the start state. S has a single transition, to state 1 mod n. If the machine is currently in state i, on input a it moves to state (i+1) mod n. State 0 is
the only accepting state. (If the empty string were part of L, we could eliminate S and make state 0 the start state.)
Suppose there were a DFA with fewer than n+1 states that still recognized L. Consider the sequence of states S0, S1, ... Sn encountered while processing the string a^n. Sn must be an accepting state, since a^n is in L. But since there are fewer than n+1 distinct states in this DFA, by the pigeonhole principle there must be some state that was visited at least twice. Removing that loop gives another path (and another accepted string), with length < n, from S0 to Sn. But L contains no strings shorter than n, contradicting our assumption. Therefore no DFA with fewer than n+1 states recognizes L.
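The (n+1)-state construction can be written down and tested directly. In this Python sketch the start state is represented by the string "S" and the remainder states by the integers 0 .. n-1; that encoding and the function names are choices made only for this illustration.

```python
def build_dfa(n: int) -> dict:
    """Transition table on input 'a' for the (n+1)-state DFA described above."""
    delta = {"S": 1 % n}           # start state S moves to state 1 (mod n)
    for i in range(n):
        delta[i] = (i + 1) % n     # state i moves to state (i + 1) mod n
    return delta

def accepts_multiple(word: str, n: int) -> bool:
    """Accept exactly the strings a^(nk) with k > 0."""
    delta = build_dfa(n)
    state = "S"
    for ch in word:
        if ch != "a":
            raise ValueError("the alphabet is {a}")
        state = delta[state]
    return state == 0              # state 0 is the only accepting state

print(len(build_dfa(3)))                 # 4 states for n = 3
print(accepts_multiple("aaaaaa", 3))     # True
print(accepts_multiple("", 3))           # False, since k > 0 excludes the empty string
```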