Is it possible to have more than one minimal DFA for a regular language? - finite-automata

Suppose we create two DFAs, A and B, for a language L, and then minimise each to get its equivalent minimal DFA. Is it always the case that both minimal DFAs have the same number of states?
I designed two DFAs for the language of strings over the alphabet {0,1} whose second-to-last symbol is 1.
One has three states and the other has four, and I am unable to minimise either of them.

The minimal deterministic finite automaton for a regular language is unique up to isomorphism.
Isomorphism effectively means "equal shape". In other words, there is only one minimal DFA, and you can name its states however you want. Renaming technically creates a new automaton, but all of these renamings are isomorphic to each other: the shape is the same, only the representation differs.
So, ignoring isomorphism, the minimal DFA is unique, and minimising any two DFAs for the same language always yields the same number of states.
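To make the uniqueness claim concrete, here is a minimal sketch that minimises two different DFAs for the language from the question (second-to-last symbol is 1) and compares the resulting state counts. The function name `minimal_state_count`, the state encodings, and the choice of Moore-style partition refinement are assumptions made for illustration; they are not taken from the question.

```python
def minimal_state_count(alphabet, delta, start, accepting):
    """Count the states of the minimal DFA via Moore-style partition refinement."""
    # Keep only the states reachable from the start state.
    reachable, stack = {start}, [start]
    while stack:
        s = stack.pop()
        for a in alphabet:
            t = delta[s, a]
            if t not in reachable:
                reachable.add(t)
                stack.append(t)
    # Initial partition: accepting versus non-accepting states.
    block = {s: (s in accepting) for s in reachable}
    while True:
        # Two states stay together only if they agree on their own block
        # and on the blocks of all their successors.
        ids = {}
        new_block = {}
        for s in reachable:
            signature = (block[s],) + tuple(block[delta[s, a]] for a in alphabet)
            new_block[s] = ids.setdefault(signature, len(ids))
        if len(ids) == len(set(block.values())):
            return len(ids)  # no block was split, so the partition is stable
        block = new_block


ALPHABET = "01"

# DFA A (hypothetical): remembers the last two symbols, padded with 0s.
states_a = ["00", "01", "10", "11"]
delta_a = {(s, a): s[1] + a for s in states_a for a in ALPHABET}
accepting_a = {s for s in states_a if s[0] == "1"}

# DFA B (hypothetical): wastefully remembers the last three symbols.
states_b = [x + y + z for x in ALPHABET for y in ALPHABET for z in ALPHABET]
delta_b = {(s, a): s[1:] + a for s in states_b for a in ALPHABET}
accepting_b = {s for s in states_b if s[1] == "1"}

count_a = minimal_state_count(ALPHABET, delta_a, "00", accepting_a)
count_b = minimal_state_count(ALPHABET, delta_b, "000", accepting_b)
```

Under these assumptions both counts come out as 4, even though DFA B starts with 8 states. That also suggests a 3-state DFA for this language is worth re-checking against a few test strings.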


How can one define a language which does not fit in the Chomsky Hierarchy?

I'm asking this question because I've stumbled across the accepted answer to "Chomsky Language Types".
This quote is referring to Type-0 Grammars:
This means that if you have a language that is more expressive than
this type (e.g. English), you cannot write an algorithm that can list
each and every (and only these) words of the language
As far as I know:
There is no mathematical description of what English is, so it is meaningless to argue about where it lands in the hierarchy of formal languages.
If there were, then English would certainly be recognizable by some Type-0 grammar by virtue of being defined by a finite amount of reasoning, whether it be axioms, a grammar, or anything else. (If not, how could someone have defined it other than in a finite number of steps?)
We can't start talking about how 'expressive' a grammar needs to be to generate precisely an unknown mathematical object.
Therefore my problem:
How can one define a language which does not fit in the Chomsky Hierarchy?
If (?) it takes a finite number of steps for mathematicians to define sets whose cardinalities prevent them from being recursively enumerable, then grammars must exist which are more expressive than Type-0, since they (the mathematicians) have followed a finite number of rules (production rules, if you will) to produce a non-RE set. Where are they?
A language is a possibly-infinite set of finite words written with some finite alphabet. Since the alphabet is finite and the length of each word is finite, the words of any language are enumerable, in the sense that there exists an enumeration. In other words, the size of any language is at most countably infinite.
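The enumeration mentioned here can be sketched directly: list all words by increasing length, and within each length in alphabetical order. The generator name `all_words` is hypothetical; the point is only that every finite word over a finite alphabet shows up after finitely many steps.

```python
from itertools import count, islice, product


def all_words(alphabet):
    """Yield every finite word over `alphabet`, shortest first."""
    for length in count(0):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)


# The first seven words over {0, 1}, starting with the empty word.
first_words = list(islice(all_words("01"), 7))
```

Any language over the alphabet is a subset of this one list, which is why each individual language is at most countably infinite.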
However, since any subset of the Kleene closure of the alphabet is a language, the number of languages is not countably infinite. Hence, there is no enumeration of languages.
The Chomsky hierarchy is based on a formalism which can be expressed as a finite sentence with a finite alphabet (the same alphabet as the language being described, plus a couple of extra symbols). [Note 1] So the number of possible Type 0 grammars is countably infinite, and there cannot be a correspondence between the set of grammars and the set of languages.
However, the existence of languages (i.e. sets) for which no generative grammar exists does not necessarily mean that there is some other way of describing these languages which is "more expressive" than generative grammars. Any description which can be written as a finite string using a finite alphabet can only describe a countable infinity of sets. Whether or not it is the same countable infinity will depend on the formalisms, and in general there will be no algorithm which can demonstrate homomorphism. But some equivalences are known (such as the equivalence with Turing machines, which is a particularly interesting equivalence).
So, we have an interesting little conundrum, which is (of course) related to Gödel's Incompleteness Theorems. That is, there are more languages than ways of describing a language, no matter what system we use to describe a language. So the question "How do we describe a language for which no description is available?" does not have a good answer (and if we answer it, by calling some set "Sue", then there will still be an uncountable infinitude of possible sets for which no name exists).
While all this foraging into infinitudes is interesting, it has a few issues:
It has very little (if anything) to do with programming, so it's questionable whether it's on topic for StackOverflow.
Kurt Gödel and Georg Cantor, the two mathematicians responsible for most of the concepts in this answer, both suffered from severe depression. Just saying.
[Note 1] Although at first glance it might appear that the alphabet for a Type 0 grammar might be arbitrarily larger than the alphabet of the language being described, that is not actually the case. The grammar's alphabet consists of the target alphabet plus a finite set of non-terminals plus an → symbol; the non-terminals can be written using numbers in any convenient base, say binary. So only three additional symbols are required (and you could reduce that to two by arbitrarily designating one of the non-terminal numbers to be the arrow). (It might seem like you need a third symbol to delimit the names of non-terminals, but you can use a Fibonacci encoding to produce codes which always start with a 1 and never include two consecutive 1s, so that you can use an extra 1 at the beginning to unambiguously mark the start of the symbol.)

Are there any steps or rules to draw a DFA?

In my first lecture on "Theory of Automata", after some concepts such as alphabet, language and transition function, and a couple of simple automata for an electric circuit with one and two switches, this question came up.
I understand what an alphabet is, as well as the language of a DFA, but are there any rules or steps to follow to reach a correct automaton for a given language? Or do we just have to think it through in our heads until we get a solution that satisfies the given language?
Note:- Please keep your language as simple as you can, since this is my first lecture and I am not yet aware of concepts like regular expressions or any other thing in the subject for that matter.
If you are given a description of the language in words, say, think about all the possible strings that belong to this language. Then try to come up with a DFA that handles most of those strings. Then look into the boundary conditions, generate some strings for them, and try to accommodate them in the DFA. This might be a good starting point for you.
I am a novice, but in my experience the boundary conditions of the language to be accepted should be drawn first, and then the complexity can be added step by step while looking at the conditions that must be rejected. As a start, if the figure in the question had been for a DFA that accepts L = {01*0}, then the shortest accepted string would be "00" (and "010" the shortest one containing a 1), and the DFA can eventually be constructed keeping the trap states in mind, with some analysis. Hope this helps!
Steps To Construct DFA-
Following steps are followed to construct a DFA for Type-01 problems-
Step-01: Determine the minimum number of states required in the DFA.
Draw those states.
Step-02: Decide the strings for which DFA will be constructed.
Step-03: Construct a DFA for the strings decided in Step-02.
Step-04: Send all the remaining possible combinations to the starting state.
Do not send the remaining possible combinations to the dead state.
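As a concrete instance of the advice in the answers above, here is a sketch of a DFA for the language 01*0 (a 0, then any number of 1s, then a final 0), written as a transition table with an explicit trap state. The state names and the `accepts` helper are made up for illustration. Note that this particular language sends every "wrong" symbol to the trap state, so the rule of thumb in Step-04 applies to a different family of problems.

```python
# Hypothetical state names:
#   start  - nothing read yet
#   middle - read the leading 0, now looping on 1s
#   accept - read the closing 0
#   trap   - the input can no longer be accepted
DELTA = {
    ("start", "0"): "middle",
    ("start", "1"): "trap",
    ("middle", "1"): "middle",
    ("middle", "0"): "accept",
    ("accept", "0"): "trap",
    ("accept", "1"): "trap",
    ("trap", "0"): "trap",
    ("trap", "1"): "trap",
}


def accepts(word):
    """Run the DFA on `word` (a string over {0, 1}) and report acceptance."""
    state = "start"
    for symbol in word:
        state = DELTA[state, symbol]
    return state == "accept"
```

Trying boundary cases such as `accepts("00")`, `accepts("0")` and `accepts("0100")` is a quick way to check that no transition was forgotten.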

Impossible Finite State Automata?

After all of my attempts, I fully believe this nondeterministic FSA cannot be constructed. The language is over an alphabet of only 2's and 3's; its strings have an odd number of digits (e.g. 223, 32232), and the sum of the digits must be divisible by 5. (Examples that meet both conditions: 22222, 33333, 2222322.)
Would someone be able to construct this nondeterministic FSA graphically, with its accepting states? I would be very impressed, because after all of my attempts and those of a colleague, our only conclusion is that it cannot be done.
First, the usual name is NFA (nondeterministic finite automaton) rather than FSA. Any regular expression can be converted to an NFA, but not all languages can be specified by REs. Two simple examples: REs cannot check whether parentheses are balanced, or whether a string has an equal number of A's and B's. Your language is not such an example, though: it only requires remembering the length modulo 2 and the digit sum modulo 5, which is a finite amount of information, so it is regular. If you can find an RE for your problem, you are done.
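One way to see that an automaton for this language exists is to track two counters at once: the length modulo 2 and the digit sum modulo 5. That gives 2 × 5 = 10 states. The sketch below builds that deterministic automaton as a table (a DFA is also a valid NFA); the names `DELTA`, `START`, `ACCEPT` and `accepts` are assumptions for illustration, and the input is assumed to contain only the digits 2 and 3.

```python
# A state is a pair (length mod 2, digit sum mod 5).
STATES = [(parity, total) for parity in range(2) for total in range(5)]
DELTA = {
    ((parity, total), digit): ((parity + 1) % 2, (total + int(digit)) % 5)
    for (parity, total) in STATES
    for digit in "23"
}
START = (0, 0)
ACCEPT = {(1, 0)}  # odd length, digit sum divisible by 5


def accepts(word):
    """Run the 10-state automaton on a string of 2s and 3s."""
    state = START
    for digit in word:
        state = DELTA[state, digit]
    return state in ACCEPT
```

Drawing it graphically then amounts to drawing the ten states and, from each one, an arrow for 2 and an arrow for 3 as given by `DELTA`.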

Simulating regular expressions with deterministic finite automata and the size of the alphabet

I'm currently working my way through the "Dragon Book" (Compilers: Principles, Techniques, & Tools) and I'm kind of stuck at the lexical analysis chapter, which uses DFAs (Deterministic finite automata).
A DFA is represented as a two-dimensional array: the first dimension is indexed by state and the second by input symbol. This means that every DFA state has an entry for every symbol of the alphabet. The examples in the book use a small alphabet (usually two symbols), and they make the following note at the end of the chapter: "since a typical lexical analyzer has several hundred states in its DFA and involves the ASCII alphabet of 128 input characters, the array consumes less than a megabyte".
However, for matching strings I want to match all characters, which means the entire character set, and a lot of input files use UTF-8 encoding. This causes the alphabet, and thus the size of the DFA, to grow enormously.
This is the point where I'm stuck. How do lexical analyzers, or regular expression simulators in general, handle this?
I've had an epiphany on this problem. In lexical analysis, about the only time you want to match characters beyond the ASCII range is while doing wildcard matching, like in strings or comments. Because these characters are only used in wildcards, and not individually, all the characters with a value of 128 or higher can be represented as a single 'other' value. The alphabet and DFA remain small this way, while I am still able to use transition tables and match the entire Unicode character set.
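A minimal sketch of that idea, assuming the input is processed as UTF-8 bytes: the transition table gets 128 ASCII columns plus one extra 'other' column shared by every byte of 128 or above. The toy DFA below recognises a double-quoted string literal without escape sequences; the state names, `column` and `is_string_literal` are all hypothetical.

```python
OTHER = 128  # one shared column for every byte value >= 128
COLUMNS = 129

START, INSIDE, DONE, DEAD = range(4)


def column(byte):
    """Map a byte to its table column, folding all non-ASCII bytes together."""
    return byte if byte < 128 else OTHER


# TABLE[state][column] gives the next state; everything defaults to DEAD.
TABLE = [[DEAD] * COLUMNS for _ in range(4)]
TABLE[START][ord('"')] = INSIDE
for col in range(COLUMNS):
    TABLE[INSIDE][col] = INSIDE   # wildcard: any character keeps us inside
TABLE[INSIDE][ord('"')] = DONE    # closing quote
TABLE[INSIDE][ord("\n")] = DEAD   # no raw newlines in this toy literal


def is_string_literal(text):
    """Check whether `text` is exactly one double-quoted literal."""
    state = START
    for byte in text.encode("utf-8"):
        state = TABLE[state][column(byte)]
    return state == DONE
```

The table stays at 4 × 129 entries no matter how many distinct Unicode characters appear inside the literal, because every byte of a multi-byte UTF-8 sequence is 128 or above and lands in the same column.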
Here's an interesting tool that converts a regular expression to non-deterministic finite automata.

Building a ranking with a genetic algorithm

Question after a BIG edit:
I need to build a ranking using a genetic algorithm. I have data like this:
Now, let's interpret a, b, c, d as names of football teams, where P(x>y) is the probability that x wins against y. We want to build a ranking of the teams, but some observations are missing: P(a>d) and P(a>c) are unavailable because a has not played d or c.
The goal is to find the ordering of team names which best describes the current situation in that four-team league.
If we have only 4 teams then the solution is straightforward: first we compute probabilities for all 4! = 24 orderings of the four teams, ignoring missing values. We have:
and we choose the ranking with the highest probability. I don't want to use any other fitness function.
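For the four-team case, the brute-force search can be sketched as below. Everything specific here is an assumption: the probability values are invented, since the original data table is not shown, and the score is assumed to be the product of P(x>y) over every pair where x is ranked above y, skipping pairs with no observation. If the intended fitness is different, only `score` needs to change.

```python
from itertools import permutations
from math import prod

# Invented win probabilities; the pairs (a, c) and (a, d) are deliberately missing.
P = {("a", "b"): 0.9, ("b", "c"): 0.8, ("c", "d"): 0.7, ("b", "d"): 0.6}


def p_win(x, y):
    """Return P(x > y), or None when the two teams never met."""
    if (x, y) in P:
        return P[x, y]
    if (y, x) in P:
        return 1.0 - P[y, x]
    return None


def score(order):
    """Assumed fitness: product of P(x > y) over all pairs with x ranked above y."""
    pairs = (p_win(x, y) for i, x in enumerate(order) for y in order[i + 1:])
    return prod(p for p in pairs if p is not None)


# Exhaustive search over all 4! = 24 orderings.
best = max(permutations("abcd"), key=score)
```

The same `score` function could serve as the fitness inside a genetic algorithm once exhaustive search is no longer feasible.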
My question:
As the number of permutations of n elements is n!, calculating the probabilities for all orderings is infeasible for large n (my n is about 40). I want to use a genetic algorithm for that problem.
The mutation operator is a simple swap of the places of two (or more) elements of the ranking.
But how do I cross over two orderings?
Could P(abcd) be interpreted as the cost function of the path 'abcd' in an asymmetric TSP, where the cost of travelling from x to y differs from the cost of travelling from y to x, with P(x>y) = 1 - P(y>x)? There are so many crossover operators for TSP, but I think I have to design my own, because my problem is slightly different from TSP. Do you have any ideas for a solution or a framework for conceptual analysis?
The easiest way, at both the conceptual and implementation level, is to use a crossover operator which exchanges suborderings between two solutions:
CrossOver(ABcD,AcDB) = AcBD
For a random subset of elements (in this case 'a,b,d', shown in capital letters), we copy the first ordering's sequence of those elements and paste it into the second ordering.
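A sketch of this operator in code, reproducing the CrossOver(ABcD,AcDB) = AcBD example. The function names and the way the random subset is drawn are assumptions made for illustration.

```python
import random


def subordering_crossover(donor, receiver, subset):
    """Rewrite `receiver` so that the items in `subset` appear in the donor's order."""
    donor_order = iter([item for item in donor if item in subset])
    # Items outside the subset keep their place; slots holding subset items
    # are refilled from left to right in the donor's relative order.
    return [next(donor_order) if item in subset else item for item in receiver]


def random_subordering_crossover(parent1, parent2, rng=random):
    """Pick a random non-empty subset of items, then apply the operator."""
    size = rng.randint(1, len(parent1))
    subset = set(rng.sample(list(parent1), size))
    return subordering_crossover(parent1, parent2, subset)


child = subordering_crossover("abcd", "acdb", {"a", "b", "d"})
```

Because only the order of the chosen items changes, and not which items are present, the child is always a valid permutation.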
Edit: asymmetric TSP can be turned into symmetric TSP, but with forbidden suborderings, which makes the GA approach unsuitable.
It's definitely an interesting problem, and it seems most of the answers and comments have focused on the semantic aspects of the problem (i.e., the meaning of the fitness function, etc.).
I'll chip in some information about the syntactic elements -- how do you do crossover and/or mutation in ways that make sense. Obviously, as you noted with the parallel to the TSP, you have a permutation problem. So if you want to use a GA, the natural representation of candidate solutions is simply an ordered list of your points, being careful to avoid repetition -- that is, a permutation.
TSP is one such permutation problem, and there are a number of crossover operators (e.g., Edge Assembly Crossover) that you can take from TSP algorithms and use directly. However, I think you'll have problems with that approach. Basically, the problem is this: in TSP, the important quality of solutions is adjacency. That is, abcd has the same fitness as cdab, because it's the same tour, just starting and ending at a different city. In your example, absolute position is much more important than this notion of relative position. abcd means in a sense that a is the best point -- it's important that it came first in the list.
The key thing you have to do to get an effective crossover operator is to account for what the properties are in the parents that make them good, and try to extract and combine exactly those properties. Nick Radcliffe called this "respectful recombination" (note that paper is quite old, and the theory is now understood a bit differently, but the principle is sound). Taking a TSP-designed operator and applying it to your problem will end up producing offspring that try to conserve irrelevant information from the parents.
You ideally need an operator that attempts to preserve absolute position in the string. The best one I know of offhand is known as Cycle Crossover (CX). I'm missing a good reference off the top of my head, but I can point you to some code where I implemented it as part of my graduate work. The basic idea of CX is fairly complicated to describe, and much easier to see in action. Take the following two points:
Pick a starting point in parent 1 at random. For simplicity, I'll just start at position 0 with the "a".
Now drop straight down into parent 2, and observe the value there (in this case, "c").
Now search for "c" in parent 1. We find it at position 2.
Now drop straight down again, and observe the "h" in parent 2, position 2.
Again, search for this "h" in parent 1, found at position 7.
Drop straight down and observe the "a" in parent 2.
At this point note that if we search for "a" in parent one, we reach a position where we've already been. Continuing past that will just cycle. In fact, we call the sequence of positions we visited (0, 2, 7) a "cycle". Note that we can simply exchange the values at these positions between the parents as a group and both parents will retain the permutation property, because we have the same three values at each position in the cycle for both parents, just in different orders.
Make the swap of the positions included in the cycle.
Note that this is only one cycle. You then repeat this process starting from a new (unvisited) position each time until all positions have been included in a cycle. After the one iteration described in the above steps, you get the following strings (where an "X" denotes a position in the cycle where the values were swapped between the parents):
Just keep finding and swapping cycles until you're done.
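Here is a small sketch of the cycle-finding steps described above. The two parent strings are invented so that the first cycle matches the walkthrough (positions 0, 2, 7 holding "a", "c", "h"); they are not the original example. One assumption to flag: this sketch swaps only every other cycle, as textbook descriptions of CX do, because swapping every cycle would simply exchange the two parents.

```python
def find_cycles(parent1, parent2):
    """Return the position cycles shared by two permutations of the same items."""
    index_in_p1 = {value: i for i, value in enumerate(parent1)}
    seen, cycles = set(), []
    for start in range(len(parent1)):
        if start in seen:
            continue
        cycle, pos = [], start
        while pos not in seen:
            seen.add(pos)
            cycle.append(pos)
            # Drop straight down into parent 2, then find that value in parent 1.
            pos = index_in_p1[parent2[pos]]
        cycles.append(cycle)
    return cycles


def cycle_crossover(parent1, parent2):
    """Swap the values at alternating cycles (an assumed, textbook-style choice)."""
    child1, child2 = list(parent1), list(parent2)
    for number, cycle in enumerate(find_cycles(parent1, parent2)):
        if number % 2 == 0:
            for pos in cycle:
                child1[pos], child2[pos] = child2[pos], child1[pos]
    return child1, child2


# Invented parents, chosen so the first cycle is (0, 2, 7).
cycles = find_cycles("abcdefgh", "cdhbfgea")
child1, child2 = cycle_crossover("abcdefgh", "cdhbfgea")
```

Each position in a child holds the value that one of the two parents had at that same position, which is the absolute-position property this answer is after.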
The code I linked from my github account is going to be tightly bound to my own metaheuristics framework, but I think it's a reasonably easy task to pull the basic algorithm out from the code and adapt it for your own system.
Note that you can potentially gain quite a lot from doing something more customized to your particular domain. I think something like CX will make a better black box algorithm than something based on a TSP operator, but black boxes are usually a last resort. Other people's suggestions might lead you to a better overall algorithm.
I've worked on a somewhat similar ranking problem and followed a technique similar to what I describe below. Does this work for you:
Assume the unknown value of an object diverges from your estimate via some distribution, say, the normal distribution. Interpret your ranking statements such as a > b, 0.9 as the statement "The value a lies at the 90th percentile of the distribution centered on b".
For every statement:
def realArrival = calculate a's location on a distribution centered on b
def arrivalGap = | realArrival - expectedArrival |
def fitness = Σ arrivalGap
Fitness function is MIN(fitness)
FWIW, my problem was actually a bin-packing problem, where the equivalent of your "rank" statements were user-provided rankings (1, 2, 3, etc.). So not quite TSP, but NP-Hard. OTOH, bin-packing has a pseudo-polynomial solution proportional to accepted error, which is what I eventually used. I'm not quite sure that would work with your probabilistic ranking statements.
What an interesting problem! If I understand it, what you're really asking is:
"Given a weighted, directed graph, with each edge-weight in the graph representing the probability that the arc is drawn in the correct direction, return the complete sequence of nodes with maximum probability of being a topological sort of the graph."
So if your graph has N edges, there are 2^N graphs of varying likelihood, with some orderings appearing in more than one graph.
I don't know if this will help (very brief Google searches did not enlighten me, but maybe you'll have more success with more perseverance) but my thoughts are that looking for "topological sort" in conjunction with any of "probabilistic", "random", "noise," or "error" (because the edge weights can be considered as a reliability factor) might be helpful.
I strongly question your assertion, in your example, that P(a>c) is not needed, though. You know your application space best, but it seems to me that specifying P(a>c) = 0.99 will give a different fitness for f(abc) than specifying P(a>c) = 0.01.
You might want to throw in "Bayesian" as well, since you might be able to start to infer values for (in your example) P(a>c) given your conditions and hypothetical solutions. The problem is, "topological sort" and "Bayesian" is going to give you a whole bunch of hits related to Markov chains and Markov decision problems, which may or may not be helpful.