I'm working on a language L = { w : every pair of successive zeros in w is separated by a run of 1's whose length is 4i for some i >= 0 }
e.g. 110011110 should be accepted because the first two zeros are separated by nothing. Then the next pair is separated by 4 ones.
Here's my attempt for the NFA, anything missing?
We can use Myhill-Nerode directly to derive a minimal DFA for this language. The reasoning is straightforward.
1. e, the empty string, can be followed by any string in L to get to a string in L.
2. 0 can be followed by 1* or by (1^4i)L to get to a string in L. (Here and in the points below, the L written after a run of 1s stands for a string in L that begins with 0; otherwise the run of 1s would be longer than written.)
3. 1 can be followed by L to get to a string in L. This means it is indistinguishable from the empty string. That also means we don't need to worry about longer strings that start with 1, since they will be covered by shorter strings that don't start with 1.
4. 00 can be followed by the same stuff as 0 can, so it is indistinguishable from 0. This also means we don't need to worry about longer strings that start with 00, since they are handled by shorter strings that don't.
5. 01 can be followed by 1* or 111(1^4i)L to get to a string in L.
6. 10, 11 can be ignored as they start with 1 (see 3)
7. 000, 001 can be ignored as they start with 00 (see 4)
8. 010 cannot be followed by anything to get a string in L. We can also ignore anything that starts with this since it can't lead to a string in L.
9. 011 can be followed by 1* or 11(1^4i)L to get a string in L.
10. 100, 101, 110, 111 can be ignored as they start with 1 (see 3)
11. 0000, 0001, 0010, 0011 can be ignored as they start with 00 (see 4)
12. 0100, 0101 can be ignored since they start with 010 (see 8)
13. 0110 cannot be followed by anything to get to a string in L so is indistinguishable from 010.
14. 0111 can be followed by 1* or 1(1^4i)L to get a string in L.
15. 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111 can all be ignored since they start with 1 (see 3)
16. The only string of length four distinguishable from shorter strings was 0111; 01110 is indistinguishable from 010 in that nothing leads it to a string in L, and 01111 is indistinguishable from 0 in that it can be followed by 1* or (1^4i)L to get to something in L.
This may seem like a lot of work, but it was a pretty simple exercise. We can go back through and list our complete set of shortest-length distinguishable strings:
e, from point 1 above
0, from point 2 above
01, from point 5 above
010, from point 8 above
011, from point 9 above
0111, from point 14 above
To write down a DFA, we need one state for each of these shortest-length distinguishable strings. The transitions from the state corresponding to string x will lead to states corresponding to strings formed by concatenating input symbols to x. So:
             +-----------------------------+
             |                             | 1
         0   V   1        1         1      |
--->(e)---->(0)---->(01)---->(011)---->(0111)
    \_/     \_/      | 0       | 0      | 0
     1       0       |         |        |
                     |<--------+--------+
                     V
                   (010)
                   \___/
                    0,1
e is the initial state.
the state for e loops to itself on 1 since e.1 = 1 and 1 is indistinguishable from e. Indistinguishable strings lead to the same state in a minimal DFA.
the state for e goes to the state for 0 on 0 since e.0 = 0 and 0 is distinguishable from all strings of the same or shorter length.
state 0 leads to state 01 leads to state 011 leads to state 0111 on 1s, since 0111 = 011.1 = 01.1.1 = 0.1.1.1 and these are all distinguishable from strings of the same or shorter length.
0111 leads to 0 on 1 since 0111.1 = 01111 which is indistinguishable from 0.
state 0 loops to itself on 0 since 0.0 = 00 is indistinguishable from 0. States 01, 011 and 0111 lead to state 010 on 0 since 01.0 = 010, 011.0 = 0110 and 0111.0 = 01110 are indistinguishable from 010.
010 leads to itself on all inputs since nothing can be added to it to get a string in L, so the same is true for any concatenation with this at the front.
Now that we have the structure, we simply have to look at each state and say whether its canonical string is in L. If so, the state is accepting; otherwise, it is not.
e is in L, so (e) is accepting.
0 is in L, so (0) is accepting.
01 is in L, so (01) is accepting.
010 is not in L, so (010) is not accepting.
011 is in L, so (011) is accepting.
0111 is in L, so (0111) is accepting.
This completes the derivation of the minimal DFA for L.
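As a sanity check on the derivation, here is a minimal Python sketch that simulates the six-state machine and compares it against a direct reading of the definition of L. The names `DELTA_L`, `dfa_accepts`, `in_language` and `first_disagreement` are made up for this illustration; the transition table follows the bullet points above (in particular, state 0 loops on 0).

```python
from itertools import product

# Transition table; each state is named by its canonical string.
DELTA_L = {
    "e":    {"0": "0",   "1": "e"},
    "0":    {"0": "0",   "1": "01"},
    "01":   {"0": "010", "1": "011"},
    "011":  {"0": "010", "1": "0111"},
    "0111": {"0": "010", "1": "0"},
    "010":  {"0": "010", "1": "010"},  # dead state
}
ACCEPTING_L = {"e", "0", "01", "011", "0111"}


def dfa_accepts(w):
    """Run the DFA on w, starting from the state for the empty string."""
    state = "e"
    for symbol in w:
        state = DELTA_L[state][symbol]
    return state in ACCEPTING_L


def in_language(w):
    """Direct check: the gap between successive zeros has length 4i."""
    zeros = [i for i, ch in enumerate(w) if ch == "0"]
    return all((b - a - 1) % 4 == 0 for a, b in zip(zeros, zeros[1:]))


def first_disagreement(max_len=10):
    """Return the shortest string on which the two checks differ, or None."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            w = "".join(bits)
            if dfa_accepts(w) != in_language(w):
                return w
    return None
```

Calling `first_disagreement()` compares the two on every binary string up to length 10; this is only a brute-force spot check, not a proof of minimality.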
Related
I have to draw a DFA that accepts the set of all strings containing 1011 as a substring. I tried but could not come up with one. Can anyone help me please?
Thanks
The idea for a DFA that does this is simple: keep track of how much of that substring we have seen on the end of the input we've seen so far. If you eventually get to a point where the input you've seen so far ends with that substring, then you accept the whole input. If you get to the end of input before ever seeing a prefix that ends with your substring, you don't accept.
We can create the DFA by adding states as necessary to represent differing levels of match against the target substring. All DFAs need at least one state: let's call it q0.
---->q0
The implied alphabet of your language is {0, 1}, so we need transitions for both of these symbols on the state q0. Let's think about how much of the substring we will have seen in state q0. We can get to q0 with the empty string; that is, before consuming any input at all. After seeing the empty string we have seen zero of the four symbols that make up our substring. So, q0 should correspond to the case "the input I've seen up until now ends with a string that matches 0/4 of the target substring".
Given this, what transitions should we add for 0 and 1? If we see a 0 in state q0, that doesn't help at all, since the substring we're looking for begins with 1; so, seeing a 0 in q0 doesn't change the fact that the input we've seen so far matches 0/4 symbols. This means we can have the transition from q0 on 0 return to q0.
     /-\
   0 | |
     V /
---->q0
What about if we see a 1 in q0? Well, if we see a 1, then the input we've seen so far ends with a string that matches 1011 in exactly 1/4 places (the first 1); so, we need another state to represent the fact we're a little closer to the goal. Let's call this state q1.
     /-\
   0 | |
     V /
---->q0---->q1
         1
We repeat the process now for state q1. If we see a 0 in q1, we get a little closer to our target of 1011, so we can go to a new state, q2. If we see a 1 in q1, we don't get any closer to our goal, but we also don't fall back.
      0     1
     /-\   /-\
     | |   | |
     V /   V /
---->q0---->q1---->q2
         1     0
If we see a 0 in q2, that means we've seen the substring 00; that doesn't appear in 1011 at all, which means we are totally back to square one and must return to q0. If we see a 1 though, we get a little closer to our goal and must move to a new state; let's call this q3:
      0     1
     /-\   /-\
     | |   | |
     V /   V /
---->q0---->q1---->q2---->q3
     ^   1     0 |   1
     |           |
     \-----------/
           0
If we see a 0 in q3 then our input has ended with the substring 10, which puts us back at q2; if we see a 1, then we have seen the whole target 1011 and need to go to a new state to remember this fact.
      0     1        0
     /-\   /-\   /------\
     | |   | |   |      |
     V /   V /   V      |
---->q0---->q1---->q2---->q3---->q4
     ^   1     0 |   1     1
     |           |
     \-----------/
           0
Finally, in state q4, no matter what we see, we know we must accept the input since we've already seen the substring 1011 somewhere in the input. This means we should make q4 accepting and have both transitions go back to q4:
      0     1        0
     /-\   /-\   /------\      /---\
     | |   | |   |      |      |   |
     V /   V /   V      |      V   | 0,1
---->q0---->q1---->q2---->q3---->[q4]--/
     ^   1     0 |   1     1
     |           |
     \-----------/
           0
You can check some samples to convince yourself that this DFA accepts the language you want. We built it one state at a time by asking ourselves where the transitions had to go. We stopped adding new states when new transitions didn't demand them anymore.
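If you would rather check samples mechanically, here is a small Python sketch of the same machine. The states q0 to q4 are written as the integers 0 to 4, and the names `DELTA_1011`, `contains_1011` and `mismatches` are invented for this example. It compares the DFA against Python's own substring test.

```python
from itertools import product

# States 0..3 mean "the input so far ends with the first k symbols of 1011";
# state 4 means "1011 has already been seen somewhere".
DELTA_1011 = {
    0: {"0": 0, "1": 1},
    1: {"0": 2, "1": 1},
    2: {"0": 0, "1": 3},
    3: {"0": 2, "1": 4},
    4: {"0": 4, "1": 4},
}


def contains_1011(w):
    """Run the DFA on w and report whether it ends in the accepting state."""
    state = 0
    for symbol in w:
        state = DELTA_1011[state][symbol]
    return state == 4


def mismatches(max_len=10):
    """List every binary string up to max_len where the DFA and `in` differ."""
    found = []
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            w = "".join(bits)
            if contains_1011(w) != ("1011" in w):
                found.append(w)
    return found
```

An empty list from `mismatches()` means the machine agrees with the substring test on all strings up to that length.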
We want to construct a DFA that accepts every string containing 1011 as a substring. The alphabet is
Σ = {0, 1}
so the accepted strings include
{0111011, 001011, 11001011, ...}
Every accepted string must contain 1011 as a substring.
As the transition diagram shows, in the initial state q0 the machine moves to q1 on reading 1; otherwise it stays in q0.
In q1, reading 0 moves to q2; reading 1 stays in q1.
In q2, reading 1 moves to q3; reading 0 returns to q0, because the substring we want starts with 1, not 0.
In q3, reading 1 moves to q4, which is the final state; reaching it means the string is accepted because it contains 1011 as a substring. Reading 0 in q3 goes back to q2.
After the final state is reached, the input may continue past 1011: in 001011110, for example, 110 is still left to read. That is why q4 stays in q4 on both 0 and 1.
DFA for accepting strings with a substring 1011.
There are four states: A, B, C and D. In every DFA construction we have to check that each state has a transition for both input symbols; otherwise it is not a DFA. The task here is to construct a DFA that accepts strings with an odd number of 0's and an odd number of 1's, as described below.
A is the initial state; on 0 it goes to C and on 1 it goes to B.
B goes to D on 0 and to A on 1; C goes to D on 1 and to A on 0.
D is the final state; it goes to B on 0 and to C on 1.
This process is shown in the figure below. Let us check the DFA with the example 1011, which has an odd number of 1's and an odd number of 0's: A goes to B on 1, B goes to D on 0, D goes to C on 1, and C goes to D on 1, so the run ends in the final state D. Hence it is the required DFA.
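The trace for 1011 can be reproduced with a short Python sketch of the transitions listed above. The names `DELTA_PARITY`, `trace` and `accepts_odd_odd` are hypothetical, chosen only for this illustration.

```python
# Transitions as described: A is the start state, D the only final state.
DELTA_PARITY = {
    "A": {"0": "C", "1": "B"},
    "B": {"0": "D", "1": "A"},
    "C": {"0": "A", "1": "D"},
    "D": {"0": "B", "1": "C"},
}


def trace(w, start="A"):
    """Return the list of states visited while reading w."""
    states = [start]
    for symbol in w:
        states.append(DELTA_PARITY[states[-1]][symbol])
    return states


def accepts_odd_odd(w):
    """True when the run on w ends in the final state D."""
    return trace(w)[-1] == "D"
```

`trace("1011")` gives the sequence A, B, D, C, D, matching the walk-through in the answer.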
The following regex within DB2 SQL works pretty well to get extra elements out of an address (i.e. not the street name or number). Limiting myself to two cases (UNIT or GATE) to keep my example simple, where HAD1 is the field containing the first line of a street address:
select HAD1,
regexp_substr(HAD1,'(UNITS?|GATES?)\s[0-9A-Z]{1,}')
from ECH
where regexp_like(HAD1,'(UNIT|GATE)')
and length(trim(HAD1)) > 12
I get this:
Ship To Address Line 1           REGEXP_SUBSTR
UNIT 4, 117 MONTGOMORIE RD       UNIT 4
END OF WAINUI RD, HIGHGATE       -
UNIT 3, 37 TE ROTO DRIVE         UNIT 3
GATE 6 52 MAHIA ROAD             GATE 6
UNIT B 11 LANGSTONE LANE         UNIT B
ASHBURTON FITTINGS GATE 2        GATE 2
GOODS: PLACEMAKERS - WESTGATE    -
UNIT 3, 37 TE ROTO DRIVE         UNIT 3
ASHBURTON FITTINGS GATE 2        GATE 2
SH 8A TARRAS-LUGGATE HIGHWAY     GATE HIGHWAY
Which is very encouraging. It correctly didn't pick up HIGHGATE or WESTGATE because they weren't followed by a space then something else.
But it did pick up LUGGATE (last line), which I don't want. So I'd like to require that UNIT or GATE is not immediately preceded by another letter, i.e. that it starts a word.
As you may guess I'm an absolute beginner with regex, so thank you for your patience.
Edit
Now I have my most excellent regex like so:
\b(GATE|LEVEL|DOOR|UNITS?)\s[\dA-Z]{1,}
Using it over a larger data set I notice the occasional unwanted match where, for instance, GATE is followed by an ordinary English word:
THE THIRD GATE ON THE LEFT = GATE ON
The gates, levels, doors and units that I'm looking for will always be followed by one of the following: (a) A number of up to 6 digits (b) One letter (c) A number and one letter, possibly with a dash
Examples:
UNIT 7A
GATE 6
GATE 31113
UNIT B
LEVEL B2
LEVEL 2B
UNIT D06
So, my follow-up question is: can I limit the number of letters in the second part of the expression to 0 or 1, but allow up to six digits?
I've played around with the numbers in curly brackets but they seem to affect only how many characters are returned rather than how many characters must be present.
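One possible way to express "up to six digits, at most one letter, optional dash" is sketched below in Python. This is an assumption-laden illustration: it uses Python's `re` module rather than DB2's REGEXP_SUBSTR, the pattern is my own guess at the three cases listed above, and `UNIT_PATTERN` and `extract_unit` are made-up names. The trailing `\b` is what stops GATE ON from matching, because a single letter followed by another letter is not at a word boundary.

```python
import re

# keyword, one whitespace character, then either
#   digits (1-6), optional dash, optional single letter   e.g. 6, 31113, 7A, 2B
#   single letter, optional dash, digits (0-6)             e.g. B, B2, D06
UNIT_PATTERN = re.compile(
    r"\b(?:GATE|LEVEL|DOOR|UNITS?)\s"
    r"(?:\d{1,6}-?[A-Z]?|[A-Z]-?\d{0,6})\b"
)


def extract_unit(address):
    """Return the first gate/level/door/unit element, or None if there is none."""
    match = UNIT_PATTERN.search(address)
    return match.group(0) if match else None
```

I have not run this pattern inside DB2, so the regex flavour there may need adjustments.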
I have following patterns:
13 R 2
48 B / 5
42 B
42B
303 Box 15
303 Bte 15
303 B Bt 15
and only want to have the following results (because Box 15, Bte 15 are the box numbers, and I only want the house nbr + potentially the letter attached to the house number):
13 R 2
48 B / 5
42 B
42B
303
303
303 B
Is this possible using a regular expression? I tried the following: REGEXP_SUBSTR(my_string_variable, '^\d+(\s*\w$)?'). This however only works for patterns 3-6, and not for the first 2 and last patterns. Dropping the $ from the regex would incorrectly 'strip' the first letter for patterns 5 and 6.
I am basically assuming that if the letter behind the numeric is more than 1 character, that it belongs to the box number. For example, BTE is the French abbreviation for Boite which means Box. I realise this might be invalid if a house number has 2 letters (e.g.: 11 AA), but I would not know a solution for this and I don't think it occurs much.
This will remove: a whitespace character followed by an uppercase letter followed by at least one lowercase letter followed by at least one whitespace character followed by at least one digit at the end of the string:
RegExp_Replace(house_number, '\s[A-Z][a-z]+\s+\d+$')
See regex101.com
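To see what that pattern does to the seven sample values from the question, here is a rough Python equivalent. It uses `re.sub` with an empty replacement in place of the database function, and `BOX_SUFFIX` and `strip_box` are names invented for the sketch.

```python
import re

# Same pattern as above: whitespace, capitalised word (e.g. Box, Bte, Bt),
# whitespace, digits, anchored at the end of the string.
BOX_SUFFIX = re.compile(r"\s[A-Z][a-z]+\s+\d+$")


def strip_box(house_number):
    """Drop a trailing box designation such as ' Box 15' if one is present."""
    return BOX_SUFFIX.sub("", house_number)


samples = ["13 R 2", "48 B / 5", "42 B", "42B",
           "303 Box 15", "303 Bte 15", "303 B Bt 15"]
for value in samples:
    print(repr(value), "->", repr(strip_box(value)))
```

Single uppercase letters such as R or B are left alone because the pattern requires at least one lowercase letter after the capital.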
For Objective-C:
Hi everyone, I'm trying to convert a hex input into binary. For example, someone enters in :
A55
I want that to convert to
101001010101
I've tried looking through past posts and none seem to be working for me. Any help would be greatly appreciated.
Use a lookup table: there are only 16 possible characters in a HEX representation, each corresponding to a four-character binary code group. Go through the HEX character-by-character, obtain a lookup, and put it in the resultant NSString.
Here is a copy of the lookup table for you.
0 0000
1 0001
2 0010
3 0011
4 0100
5 0101
6 0110
7 0111
8 1000
9 1001
A 1010
B 1011
C 1100
D 1101
E 1110
F 1111
There are multiple options as to how to do lookups. The simplest way would be making a 128-element array, and placing NSStrings at the elements corresponding to codes of the characters (i.e. at positions '0', '1', ..., 'E', 'F', with single quotes; these are very important).
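The question is about Objective-C, but the lookup idea is language-independent. Here is a minimal Python sketch of it, with a dictionary standing in for the array indexed by character code; `HEX_TO_BIN` and `hex_to_binary` are hypothetical names.

```python
# One four-character binary group per hex digit, as in the table above.
HEX_TO_BIN = {
    "0": "0000", "1": "0001", "2": "0010", "3": "0011",
    "4": "0100", "5": "0101", "6": "0110", "7": "0111",
    "8": "1000", "9": "1001", "A": "1010", "B": "1011",
    "C": "1100", "D": "1101", "E": "1110", "F": "1111",
}


def hex_to_binary(hex_text):
    """Convert character by character; raises KeyError on a non-hex character."""
    return "".join(HEX_TO_BIN[ch] for ch in hex_text.upper())
```

For the input from the question, `hex_to_binary("A55")` yields 101001010101.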
I believe there is a built-in function for this. If not, you should at least be able to go hex->dec then dec->bin
You can write the conversion from scratch if you know the number of characters, bin to hex is common enough algorithmically.
A mathematical look at the algorithms
SO Answers in C/C++ Another
Base 10 to base n in Objective C
C Hex->Bin
Build a lookup table (an array where you can supply a value between 0 and 15 to get the binary for that hex digit):
char *hex_to_bin[] = {
"0000", "0001", "0010", "0011",
/* ... */
"1100", "1101", "1110", "1111"
};
There should be 16 elements in that table. The conversion process for multiple digits is to handle one digit at a time, appending the results onto the end of your result storage.
Use getchar() to read a char:
int c = getchar();
if (c < 0) { puts("Error: Invalid input or premature closure."); }
Use strchr() to determine which array index to retrieve:
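const char *digits = "00112233445566778899AaBbCcDdEeFf";
const char *pos = (c > 0) ? strchr(digits, c) : NULL; /* NULL when c is not a hex digit */
if (pos == NULL) { puts("Error: Not a hexadecimal digit."); return 1; }
size_t digit = (size_t)(pos - digits) / 2;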
Look up the corresponding binary values for digit:
printf("%s", hex_to_bin[digit]); // You'll want to use strcat here.
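The index trick in the strchr() step (each digit appears twice in the string, so position divided by two is the digit's value) can be tried out quickly in Python. This is only a sketch of that one step, not a translation of the C program; `DIGITS`, `hex_digit_value` and `hex_to_binary_by_index` are names made up here, and `format(value, "04b")` stands in for the hex_to_bin table.

```python
# Upper- and lower-case forms sit next to each other, so pos // 2 is the value.
DIGITS = "00112233445566778899AaBbCcDdEeFf"


def hex_digit_value(ch):
    """Return 0..15 for a single hex digit, or raise ValueError."""
    pos = DIGITS.find(ch) if len(ch) == 1 else -1
    if pos < 0:
        raise ValueError(f"not a hexadecimal digit: {ch!r}")
    return pos // 2


def hex_to_binary_by_index(text):
    """Handle one digit at a time and append each four-bit group."""
    return "".join(format(hex_digit_value(ch), "04b") for ch in text)
```

Unlike the bare strchr() call, this version rejects characters that are not in the digit string instead of computing an index from a failed search.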
Using VB.Net
When I add numbers that have leading zeros, the result is shown without the leading zeros.
For Example
Dim a, b, c as int32
a = 001
b = 5
c = a + b
a = 009
b = 13
c = a + b
Showing output as 6 instead of 006, 22 instead of 022
Expected output
006
022
How to do this.
Need vb.net code help
You need to store a number as a string if you want to store the exact number of zeros. Then addition won't work though.
If you just want to display the number with 3 digits, you can store it as an integer and format the result when you print it.
c.ToString("D3")
A leading zero has no value. If you do a regular mathematical calculation of 001 + 5, the result is still 6. I would suggest you check out string padding.
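The question asks for VB.Net, where the D3 format above already does the job. Purely as an illustration of the same idea (add as integers, pad only when displaying), here is a small Python sketch; `add_padded` is a hypothetical helper and the width of 3 is taken from the expected output in the question.

```python
def add_padded(a_text, b_text, width=3):
    """Add two numbers given as text and zero-pad the sum to `width` digits."""
    total = int(a_text) + int(b_text)   # leading zeros are ignored by int()
    return format(total, f"0{width}d")  # padding is applied only for display


print(add_padded("001", "5"))
print(add_padded("009", "13"))
```

A sum with more digits than the width is printed in full rather than truncated.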