I tried to minimize this DFA through table-filling method:
So this is the table that I made for this figure:
We can conclude from this table that the figure is already minimized and we dont need to continue.
Is that correct?
We can use Myhill-Nerode to check whether the states of this DFA correspond to unique equivalence classes under the indistinguishability relation.
Strings leading to State 1 can be followed by any string in the language and is the first we've looked at, so we must have such a state in a minimal DFA.
Strings leading to State 2 cannot be followed by a to get an accepted string, so State 2 corresponds to a class distinct from State 1's.
State 3 is not accepting and so must correspond to a class distinct from State 1 and State 2.
State 4 is not accepting, so it's class cannot be the same as State 1's or State 2's. Strings leading to State 4 cannot be followed by b to get a string in the language, so State 4 is also distinct from State 3.
State 5 is not accepting either, so it is distinct from State 1 and State 2. Strings leading to State 5 can be followed by a to get a string in the language, so State 5 is also distinct from State 3 and State 4.
State 6 is not accepting, so it is also distinct from State 1 and State 2. It can be followed by either a or b to get a string in the language, so it is distinct from States 3, 4 and 5.
State 7 is accepting, so it is distinct from States 3, 4, 5 and 6. Strings leading to State 7 can befollowed by either a or aa to get a string in the language, so State 7 is distinct from States 2 and 1, respectively.
Because all of the states are distinguishable, the DFA is minimal for whatever language it accepts.
Related
L = {w | 2|w|a != 3|w|b + 2} ∪ {aaab, bbba}.
|w|a = number of a's, same for b.
How do I use the top of the stack when only the number of a's/b's counts and they can be in any order?
It is not completely clear what you mean by "use the top of the stack."
To construct a PDA may start with one for the language {w: |w|a = |w|b}.
When it reads an a it
puts an a on the stack if there is already an a or the stack is empty
removes a b from the stack
For the case of reading a b symmetrically. The PDA accepts if the stack is empty when the entire input has been read. So the stack indicates whether so far more a or more b have been read, because the majority symbol is the one that it contains.
With the factors 2 and 3 and the added 2 b it becomes a bit more complicated. I would not handle this in the stack but in the states. This means, we implement a counter for 0 or 1 a and for 0,1 or 2 b there. When we read a an input symbol x, we first try to increment the respective counter in the state. If this is possible, this is the only thing we do. If the counter is full, we set it to zero and take the action corresponding to this symbol in the PDA above for the stack.
For the +2, we count the first two b in the states before we actually start filling the counter.
The following regex within DB2 SQL works pretty well to get extra elements out of an address (i.e. not the street name or number). Limiting myself to two cases (UNIT or GATE) to keep my example simple, where HAD1 is the field containing the first line of a street address:
select HAD1,
regexp_substr(HAD1,'(UNITS?|GATES?)\s[0-9A-Z]{1,}')
from ECH
where regexp_like(HAD1,'(UNIT|GATE)')
and length(trim(HAD1)) > 12
I get this:
Ship To REGEXP_SUBSTR
Address
Line 1
UNIT 4, 117 MONTGOMORIE RD UNIT 4
END OF WAINUI RD, HIGHGATE -
UNIT 3, 37 TE ROTO DRIVE UNIT 3
GATE 6 52 MAHIA ROAD GATE 6
UNIT B 11 LANGSTONE LANE UNIT B
ASHBURTON FITTINGS GATE 2 GATE 2
GOODS: PLACEMAKERS - WESTGATE -
UNIT 3, 37 TE ROTO DRIVE UNIT 3
ASHBURTON FITTINGS GATE 2 GATE 2
SH 8A TARRAS-LUGGATE HIGHWAY GATE HIGHWAY
Which is very encouraging. It correctly didn't pick up HIGHGATE or WESTGATE because they weren't followed by a space then something else.
But it did pick up LUGGATE (last line), which I don't want. So, I'd like to be able to include that my text strings are not preceded by any character.
As you may guess I'm an absolute beginner with regex, so thank you for your patience.
Edit
Now I have my most excellent regex like so:
\b(GATE|LEVEL|DOOR|UNITS?)\s[\dA-Z]{1,}
Using it over a larger data set I notice the occasional unwanted match where, for instance, GATE is followed by an ordinary English word:
THE THIRD GATE ON THE LEFT = GATE ON
The gates, levels, doors and units that I'm looking for will always be followed by one of the following: (a) A number of up to 6 digits (b) One letter (c) A number and one letter, possibly with a dash
Examples:
UNIT 7A
GATE 6
GATE 31113
UNIT B
LEVEL B2
LEVEL 2B
UNIT D06
So, my follow up question is, can I limit the number of letters in second part of the expression to 0 or 1, but allow up to six digits.
I've played around with the numbers in curly brackets but they seem to affect only how many characters are returned rather than how many characters must be present.
Could you please help me to understand this problem:
Convert the input sequence of N (1 ≤ N ≤ 20) input numbers so that
the subsequences of the same numbers are replaced with the first
numbers of the subsequences. Each input number is in the range [1, 2
000 000 000].
For example, the input sequence 1 2 2 3 1 1 1 4 4 is converted into
1 2 3 1 4.
Input: First, the number T of test cases is given. Each test case is
specified using two lines. The first one contains the number N and the
second one contains the numbers of the sequence.
Output: The converted sequence. The result for each test case should
be printed in a separate line.
For example, the input sequence 1 2 2 3 1 1 1 4 4 is converted into 1 2 3 1 4.
It looks like the idea is to remove duplicate numbers that occur adjacent to each other when creating the output.
You can do that by just keeping a state variable recording what the previous value was. When you get a new value, compare it to the state value. If it's the same, skip. If different, output it and update the state variable. Remember to initialize the state variable to a value not found in the input stream (e.g. -1 should work in this case).
I read in a book on non-deterministic mapping there is mapping from Q*∑ to 2Q for M=(Q,∑,trans,q0,F)
where Q is a set of states.
But I am not able to understand how it's 2Q;
if there are 3 states a, b, c, how does it map to 8 states?
I always found that the easiest way to think about these (since the set of states is finite) is as having each of those subsets be an encoding of a base-2 number that ranges from 0 (all bits zero) to 2|Q|-1 (all bits one), where there are as many bits in the number as there are members in the state set, Q. Then, you can just take one of these numbers and map it into a subset by using whether a particular bit in the number is set. Easy!
Here's a worked example where Q = {a,b,c}. In this case, |Q| is 3 (there are three elements) and so 23 is 8. That means we get this if we say that the leading bit is for element a, the next bit is for b, and the trailing bit for c:
0 = 000 = {}
1 = 001 = {c}
2 = 010 = {b}
3 = 011 = {b,c}
4 = 100 = {a}
5 = 101 = {a,c}
6 = 110 = {a,b}
7 = 111 = {a,b,c}
See? That initial three states has been transformed into 8, and we have a natural numbering of them that we could use to create the labels of those states if we chose.
Now, to the interpretations of this within a non-deterministic context. Basically, the non-determinism means that we're uncertain about what state we're in. We represent this by using a pseudo-state that is the set of “real” states that we might be in; if we have total non-determinism then we are in the pseudo-state where all real-states are possible (i.e., {a,b,c}) whereas the pseudo-state where no real-states are possible (i.e., {}) is the converse (and really ought to be impossible to reach in the transition system). In a real system, you're usually not dealing with either of those extremes.
The logic of how you convert the deterministic transition system into a non-deterministic one is rather more complex than I want to go into here. (I had to read a substantial PhD thesis to learn it so it's definitely more than an SO answer's worth!)
2Q means the set of all subsets of Q. For each state q and each letter x from sigma, there is a subset of Q states to which you can go from q with letter x. So yeah, if there are three states abc the set 2Q consists of 8 elements {{}, {a}, {b}, {c}, {a,b}, {a,c}, {b,c}, {a,b,c}}. It doesn't map to 8 states, it maps to one of these 8 sets. HTH
I have a DFA question (Determinant Finite Automata) . We are using JFLAP to construct the automata. I cannot figure this question out to save my life! Here it is
"DFA to recognize the language of all strings that have an even number of zeros and an odd number of ones."
So the alphabet is {0,1} and only using 0,1. So I need to build an automata that recognizes an even number of zeros and an odd number of ones.
I don't know whether my understanding is right.
I could give you the description in Grail format that generate an even number of zeros and an odd number of ones.
START 1
1 1 2
2 1 1
1 0 3
3 0 4
4 0 3
FINAL 3