How can we find regular expressions for the following sets of strings? - finite-automata

Find regular expressions representing the following sets:
1. The set of all strings over {a,b} in which the number of occurrences of a is divisible by 3.
2. The set of all strings over {0,1} beginning with 00.

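You can draw out a DFA and use that to find the regular expression.
For example, for 1., it would be a DFA with three states that track the number of a's read so far modulo 3, where the start state is also the only accepting state and every state loops to itself on b.
Then you convert this into a regular expression. This is one way.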

For 1, you need an expression that gives every possible way of having a string over {a,b} with the number of occurrences of a divisible by 3. There can be 0 a's, since 0 is divisible by 3. There can be 3 a's, 6 a's, 9 a's, and so on. An expression for this is (b*ab*ab*ab*)* + b*. The second term allows for the possibility of 0 a's and any number of b's. The first term accounts for all other strings whose number of a's is divisible by 3: each repetition of the group contributes exactly three a's, with any number of b's before, between, and after them.
For 2, the set of all strings over {0,1} is (0+1)* and if it must begin with 00, then the regex is simply 00(0+1)*
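A quick way to gain confidence in expressions like these is a brute-force check against the definition. The sketch below tests the expressions written with explicit Kleene stars, (b*ab*ab*ab*)* + b* and 00(0+1)*, on every string up to a small length. The names `all_strings` and `check` are made up for this illustration, and Python's `re` syntax writes union as `|` rather than `+`.

```python
import re
from itertools import product

# Python's re module writes union as "|" where the formal notation uses "+".
MULTIPLE_OF_THREE_AS = re.compile(r"(b*ab*ab*ab*)*|b*")
STARTS_WITH_00 = re.compile(r"00(0|1)*")


def all_strings(alphabet, max_len):
    """Yield every string over `alphabet` with length 0..max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)


def check(max_len=7):
    """Compare each regex with the plain-language definition of its set."""
    for s in all_strings("ab", max_len):
        expected = s.count("a") % 3 == 0
        assert bool(MULTIPLE_OF_THREE_AS.fullmatch(s)) == expected, s
    for s in all_strings("01", max_len):
        expected = s.startswith("00")
        assert bool(STARTS_WITH_00.fullmatch(s)) == expected, s
    return True
```

Such a check only covers short strings, so it supports the expressions rather than proving them.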

how to use positioning/range in regexp

I have a product code where the references always follow this pattern: XX00XX000XX. Characters 1 and 2 are always a combination of 2 letters, 3 to 4 a combination of 2 numbers, 5 to 6 letters, 7 to 9 numbers and 10 to 11 letters again (they're always varying, so it will never be the same).
I want to do a regexp_contains (or another variant) that matches by position, e.g. positions 1 - 2 must be [[:alpha:]], 3 - 4 [[:digit:]], and so on.
(I need this to find product codes that match the reference pattern inside sell links, but I can't find any clear explanation of how to use positioning in regex statements...)
You can use character classes for this.
[a-zA-Z][a-zA-Z]\d\d[a-zA-Z][a-zA-Z]\d\d\d[a-zA-Z][a-zA-Z]
This regex uses the classes [a-zA-Z] and \d, which match a letter and a digit respectively. It explicitly checks that the first character is a letter, the second character is a letter, the third character is a digit, and so on.
A character class matches exactly one character from the specified set, so [a-zA-Z] matches any letter, [13579] matches any odd digit, etc.
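As a rough illustration of the same idea outside SQL, here is a Python sketch that pulls codes of the shape XX00XX000XX out of a link. The function name and the sample URL are invented for the example, and `[0-9]` is used instead of `\d` so that only ASCII digits match.

```python
import re

# Two letters, two digits, two letters, three digits, two letters.
# {n} repeats the preceding class n times, which shortens the pattern.
PRODUCT_CODE = re.compile(r"[A-Za-z]{2}[0-9]{2}[A-Za-z]{2}[0-9]{3}[A-Za-z]{2}")


def find_product_codes(text):
    """Return every substring of `text` that has the product-code shape."""
    return PRODUCT_CODE.findall(text)


# Hypothetical sell link, used only to show the call.
print(find_product_codes("https://shop.example/item/AB12CD345EF?ref=1"))
```

Because the pattern has no anchors, it also matches a code embedded in a longer run of letters and digits; adding boundaries would be a separate design decision that depends on what the links look like.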

Is there a SQL function statement which returns an array within limits and replaces digits with * to reduce the size

I need to write a SQL function that takes two different integers of the same length (the same number of digits) as arguments: one lower threshold and one upper threshold.
The function should return a vector with all whole numbers between the two thresholds, but to reduce the array length it is supposed to return a wildcard character in place of digits wherever possible.
Example: output of the function for a lower threshold L = 3778 and an upper threshold U = 9423
To further clarify, the line in the example showing 941* has one digit replaced by the wild card character and hence represents all values from 9410 - 9419
The line in the example showing 93* has two digits replaced by the wildcard character and represents all values from 9300 - 9399
And so on.
9423
9422
9421
9420
941*
93*
.
378*
3779
3778
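One possible reading of this requirement, sketched in Python rather than SQL to show the logic only: walk upward from the lower threshold and, at each step, emit the largest block of 10, 100, 1000, ... consecutive values that starts at the current number and still fits under the upper threshold. The function name `wildcard_range` and the descending output order are assumptions made to mirror the listing above.

```python
def wildcard_range(lower, upper):
    """Cover lower..upper with as few entries as possible, using '*' for
    trailing digits that may take any value. Returns strings, largest first."""
    width = len(str(lower))
    entries = []
    x = lower
    while x <= upper:
        k = 0  # number of trailing digits replaced by the wildcard
        # Grow the block while it stays aligned and inside the range.
        while (k + 1 < width
               and x % 10 ** (k + 1) == 0
               and x + 10 ** (k + 1) - 1 <= upper):
            k += 1
        text = str(x).zfill(width)
        entries.append(text[:width - k] + "*" if k else text)
        x += 10 ** k
    return entries[::-1]
```

For the thresholds 3778 and 9423 this sketch returns 21 entries, starting with 9423, 9422, 9421, 9420, 941* and ending with 378*, 3779, 3778. It also emits 940* for 9400 - 9409, which the listing above does not show.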

Discrete Binary Search Main Theory

I have read this: https://www.topcoder.com/community/competitive-programming/tutorials/binary-search.
I can't understand this part:
What we can call the main theorem states that binary search can be
used if and only if for all x in S, p(x) implies p(y) for all y > x.
This property is what we use when we discard the second half of the
search space. It is equivalent to saying that ¬p(x) implies ¬p(y) for
all y < x (the symbol ¬ denotes the logical not operator), which is
what we use when we discard the first half of the search space.
But I think this condition does not hold when we want to find an element (checking for equality only) in an array; it only holds when we are testing an inequality, for example when searching for an element greater than or equal to our target value.
Example: We are looking for 5 in this array.
indexes = 0 1 2 3 4 5 6 7 8
values  = 1 3 4 4 5 6 7 8 9
We define p(x) as:
if (a[x] == 5) return true else return false
Step one: middle index = (8+1)/2 = 9/2 = 4, and a[4] = 5.
So p(4) is true, and according to the main theorem p(x+1), ..., p(n) should all be true, but they are not.
So what is the problem?
We CAN use that theorem when looking for an exact value, because we
only use it when discarding one half. If we are looking for, say, 5
and we find, say, 6 in the middle, then we can discard the upper half,
because we now know (due to the theorem) that all items in there are > 5.
Also notice that if we have a sorted sequence and want to find any element
that satisfies an inequality, looking at the end elements is enough.
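One common way to reconcile exact-match search with the theorem, shown here as an illustrative sketch rather than as the tutorial's own code, is to search with a monotone predicate such as p(x) = (a[x] >= target) and to test for equality only once at the end. The names `first_true` and `find_exact` are invented for this example.

```python
def first_true(lo, hi, p):
    """Return the smallest x in [lo, hi) with p(x) true, or hi if there is none.
    Requires p to be monotone: once true, it stays true for all larger x."""
    while lo < hi:
        mid = (lo + hi) // 2
        if p(mid):
            hi = mid        # the answer is mid or something to its left
        else:
            lo = mid + 1    # everything up to and including mid is false
    return lo


def find_exact(a, target):
    """Index of the first occurrence of target in sorted list a, or -1."""
    i = first_true(0, len(a), lambda x: a[x] >= target)
    return i if i < len(a) and a[i] == target else -1
```

With the array from the question, `find_exact([1, 3, 4, 4, 5, 6, 7, 8, 9], 5)` returns 4. The predicate a[x] == 5 is not monotone, which is why it cannot be plugged into the theorem directly, while a[x] >= 5 is.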

Create all combinations of a word through spaces in the Console Application

I'm trying to experiment with this, http://gyazo.com/8190a3c98a520bbeb77335e05ea5a636 (a Visual Basic console application). I want it to let the user enter a word and have the console reply with every possible spaced combination of it, so:
Say I'm using the word TEST; it would be spaced out like this:
T EST
T E ST
T E S T
TE ST
TES T
T ES T
And so on... (that is, every combination in which it can be spaced out, with multiple spaces or not)
Is this possible through the Console Application?
When counting, you start at the lowest digit. You start with that digit at zero and you count up until you reach the highest value for that digit, like this: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. Then, once you reach the highest value, you have to add a second digit (e.g. 10). Then you go from lowest to highest again on the lowest digit (e.g. 10 - 19) before incrementing the second digit again (e.g. 20). In that way, once you reach 999, you will have listed every possible combination of values in a three digit number.
When counting in binary, it works the same way, but the highest value for each digit is one, so you count up on the lowest digit like this: 0, 1. Then you have to add the second digit and count up again: 10, 11. Then you need to add a third digit (e.g. 100) and do it all again on the first two. By the time you get to 111, you will have listed every possible combination of 1's and 0's in a three digit binary number.
So, if you think of the space between each pair of adjacent letters as a digit in a binary number, where 0 means no space and 1 means there is a space, then all you have to do is count up from 0 to the highest value of a binary number whose number of digits is the length of your word minus 1. So, for instance, with the word TEST, the counting would look like this:
000 = TEST
001 = TES T
010 = TE ST
011 = TE S T
100 = T EST
...
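The counting idea above can be sketched in a few lines. The example is in Python rather than Visual Basic, purely to show the logic, and the function name `spaced_variants` is made up for the illustration.

```python
def spaced_variants(word):
    """List every way to insert single spaces between the letters of word,
    in the order produced by counting up in binary."""
    gaps = max(len(word) - 1, 0)  # one binary digit per gap between letters
    variants = []
    for mask in range(2 ** gaps):
        pieces = []
        for i, letter in enumerate(word):
            pieces.append(letter)
            # Gap i follows letter i; the leftmost gap is the highest bit.
            if i < gaps and (mask >> (gaps - 1 - i)) & 1:
                pieces.append(" ")
        variants.append("".join(pieces))
    return variants


for line in spaced_variants("TEST"):
    print(line)
```

For a word with n letters this produces 2^(n-1) lines, so TEST gives 8 of them, from TEST up to T E S T.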

SQL - Create Unique AlphaNumeric based on a 10-digit integer stored as VARCHAR

I'm trying to emulate a function in SQL that a client has produced in Excel. In effect, they have a unique, 10-digit numeric value (VARCHAR) as the primary key in one of their enterprise database systems. Within another database, they require a unique, 5-digit alphanumeric identifier. They want that 5-digit alphanumeric value to be a representation of the 10-digit number. So what they did in excel was to split the 10-digit number into pairs, then convert each of those pairs into a hexadecimal value, then stitch them back together.
The EXCEL equation is:
=IF(VALUE(MID(A2,1,4))>0,DEC2HEX(VALUE(MID(A2,3,2)))&DEC2HEX(VALUE(MID(A2,5,2)))&DEC2HEX(VALUE(MID(A2,7,2)))&DEC2HEX(VALUE(MID(A2,9,2))),DEC2HEX(VALUE(MID(A2,5,2)))&DEC2HEX(VALUE(MID(A2,7,2)))&DEC2HEX((VALUE(MID(A2,9,2)))))
I need the SQL equivalent of this. Of course, should someone out there know a better way to accomplish their goal of "a 5-digit alphanumeric identifier" based off the 10-digit number, I'm all ears.
ADDED 8/2/2011
First of all, thank you to everyone for the replies. Nice to see folks willing to help and even enjoying it! Based on all the responses, I'm apt to tell my client that their intent is sound and only their method is off kilter. I'd also like to recommend a solution. So the challenge remains, just modified slightly:
CHALLENGE: Within SQL, take a 10 digit, unique NUMERIC string and represent it ALPHANUMERICALLY in as few characters as possible. The resulting string must also be unique.
Note that the first 3-4 characters in the 10-digit string are likely to be zeros, and that they could be stripped to shorten the resulting alphanumeric string. Not required, but perhaps helpful.
This problem is inherently impossible. You have a 10 digit numeric value that you want to convert to a 5 digit alphanumeric value. Since there are 10 numeric characters, this means that there are 10^10 = 10 000 000 000 unique values for your 10 digit number. Since there are 36 alphanumeric characters (26 letters + 10 numbers), there are 36^5 = 60 466 176 unique values for your 5 digit number. You cannot map a set of 10 billion elements into a set with around 60 million.
Now, let's take a closer look at what your client's code is doing:
"So what they did in excel was to split the 10-digit number into pairs, then convert each of those pairs into a hexadecimal value, then stitch them back together."
This isn't 100% accurate. The Excel code never uses the first 2 digits, but performs this operation on the remaining 8. There are two main problems with this algorithm which may not be intuitively obvious:
1. Two 10 digit numbers can map to the same 5 digit number. Consider the numbers 1000000117 and 1000001701. The last four digits of 1000000117 get mapped to 1 11, whereas the last four digits of 1000001701 get mapped to 11 1. This causes both to map to 00111.
2. The 5 digit number may not even end up being 5 digits! For example, 1000001616 gets mapped to 001010.
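Both problems are easy to reproduce. The sketch below is a Python re-creation of the spreadsheet formula from the question, not SQL. The name `excel_id` is invented, and it assumes the input is always a 10-character string of digits.

```python
def excel_id(code):
    """Mimic the Excel formula: convert 2-digit pairs to hex and concatenate.
    `code` is assumed to be a 10-character string of digits."""
    def dec2hex(pair):
        # Like Excel's DEC2HEX without a width argument: no zero padding.
        return format(int(pair), "X")

    pairs = [code[i:i + 2] for i in range(0, 10, 2)]  # five 2-digit pairs
    if int(code[:4]) > 0:
        used = pairs[1:]  # MID(A2,3,2), MID(A2,5,2), MID(A2,7,2), MID(A2,9,2)
    else:
        used = pairs[2:]  # MID(A2,5,2), MID(A2,7,2), MID(A2,9,2)
    return "".join(dec2hex(p) for p in used)


print(excel_id("1000000117"), excel_id("1000001701"))  # the same value twice
print(excel_id("1000001616"))                          # six characters long
```

The root cause in both cases is the missing padding: a pair can turn into either one or two hex digits, so the boundaries between pairs are lost when the pieces are joined.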
So, what is a possible solution? Well, if you don't care if that 5 digit number is unique or not, in MySQL you can use something like:
hex(<NUMERIC VALUE> % 0xFFFFF)
The log of 10^10 base 2 is 33.219280948874
> return math.log(10 ^ 10) / math.log(2)
33.219280948874
> = 2 ^ 33.21928
9999993422.9114
So, it takes 34 bits to represent this number. In hex this will take 34/4 = 8.5, i.e. 9 characters, much more than 5.
> return math.log(10 ^ 10) / math.log(16)
8.3048202372184
The Excel formula is ignoring the first 2 (or 4) characters of the 10 character string.
You could try encoding in base 36 instead of 16. This will get you to 7 characters or less.
> return math.log(10 ^ 10) / math.log(36)
6.4254860446923
The popular base 64 encoding will get you to 6 characters
> return math.log(10 ^ 10) / math.log(64)
5.5365468248123
Even Ascii85 encoding won't get you down to 5.
> return math.log(10 ^ 10) / math.log(85)
5.1829075929158
You need base 100 to get to 5 characters
> return math.log(10 ^ 10) / math.log(100)
5
There aren't 100 printable ASCII characters, so this is not going to work, as zkhr explained as well, unless you're willing to go beyond ASCII.
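To make the base 36 suggestion concrete, here is a minimal sketch in Python (the question asks for SQL, so this only illustrates the arithmetic). The function names and the choice of digits followed by uppercase letters as the alphabet are assumptions.

```python
import string

ALPHABET = string.digits + string.ascii_uppercase  # 0-9 then A-Z, 36 symbols


def to_base36(number_text):
    """Encode a string of decimal digits as a base 36 string."""
    n = int(number_text)
    if n == 0:
        return "0"
    out = []
    while n:
        n, remainder = divmod(n, 36)
        out.append(ALPHABET[remainder])
    return "".join(reversed(out))


def from_base36(code, width=10):
    """Decode a base 36 string back to a zero-padded decimal string."""
    return str(int(code, 36)).zfill(width)
```

Because the conversion is a plain change of base, distinct 10 digit inputs always give distinct outputs, and the largest input, 9999999999, needs 7 characters. Leading zeros in the input shorten the result, which matches the note in the question about stripping them.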
I found your question interesting (although I don't claim to know the answer). Out of interest I googled a bit and found this, which may help you: http://dpatrickcaldwell.blogspot.com/2009/05/converting-decimal-to-hexadecimal-with.html