What is the complexity of networkx.is_isomorphic(graph1, graph2)?
I am particularly interested to know it in the case of directed acyclic graphs.
Cheers.
According to the documents of nx.is_isomorphic the vf2-algorithm is implemented and even the original scientific reference is given.
"L. P. Cordella, P. Foggia, C. Sansone, M. Vento, “An Improved Algorithm for Matching Large Graphs”, 3rd IAPR-TC15 Workshop on Graph-based Representations in Pattern Recognition, Cuen, pp. 149-159, 2001."
The boost library states for the vf2-algorithm the following complexity:
"The spatial complexity of VF2 is of order O(V), where V is the (maximum) number of vertices of the two graphs. Time complexity is O(V^2) in the best case and O(V!·V) in the worst case."
Related
I have a combinatorial optimization problem for which I have a genetic algorithm to approximate the global minima.
Given X elements find: min f(X)
Now I want to expand the search over all possible subsets and to find the one subset where its global minimum is maximal compared to all other subsets.
X* are a subset of X, find: max min f(X*)
The example plot shows all solutions of three subsets (one for each color). The black dot indicates the highest value of all three global minima.
image: solutions over three subsets
The main problem is that evaluating the fitness between subsets runs agains the convergence of the solution within a subset. Further the solution is actually a local minimum.
How can this problem be generally described? I couldn't find a similar problem in the literature so far. For example if its solvable with a multi-object genetic algorithm.
Any hint is much appreciated.
While it may not always provide exactly the highest minima (or lowest maxima), a way to maintain local optima with genetic algorithms consists in implementing a niching method. These are ways to maintain population diversity.
For example, in Niching Methods for Genetic Algorithms by Samir W. Mahfoud 1995, the following sentence can be found:
Using constructed models of fitness sharing, this study derives lower bounds on the population size required to maintain, with probability gamma, a fixed number of desired niches.
If you know the number of niches and you implement the solution mentioned, you could theoretically end up with the local optima you are looking for.
Thanks for your willingness to help.
Straight to the point, I'm confused with the use of Big O notation when analyzing the worst case time complexity of search algorithms.
For example, the worst case time complexity of Alpha-Beta Pruning is O(b^d) where ^ means ~ to the power of ~, b representing the average branching factor and d representing the depth of the search tree.
I do get that the worst case time complexity would be less or equal to a positive constant multiplied by b^d, but why is the use of big O notation permitted here? Where did the variable n, the input size, go? I do know that the input of same size might cause significant difference in time complexity of an algorithm.
All of the research I've done only explains "the use of big o notation in the analysis of worst case time complexity" in terms of the growth function, a function that has variable y as time complexity and variable x as input size. There are also formal definitions of big o notation, which make me even more confused with the question above. definition 1definition 2
Any attempts to answer my question would be greatly appreciated.
The input size you refer here to n is in this case d. If n is the amount of entries in your tree, d can be calculated by ln_2(n), assuming your tree is a balanced binary tree.
Big O notation implies that you are discussing what the runtime would be for a very large n. In the case you noted, O(b^d), the n is the variable that changes with input size. In this case, d would be your n. As you've found, some notations make use of many variables.
n is just a general term for the number of elements, but runtime could vary on many factors- depth of a tree, or a different list entirely. For example, to traverse lists like this:
for n in firstList:
for k in secondList:
do stuff
the cost would be O(n*k).
I'm new to genetic algorithms and am writing code for the Traveling Salesman problem. I'm using cycle crossover to generate new offspring and I've found that this leads to some of the offspring retaining the same exact phenotype as one parent even when the two parents are different. Would translating the chromosomes avoid this?
By translate I mean a chromosome with phenotype ABCDE shifting over two to DEABC. They would be equivalent answers and have equal fitness, but might make more diverse offspring.
Is this worth it in the long run, or is it just wasting computing time?
Cycle crossover (CX) is based on the assumption that it's important to preserve the absolute position of cities (a city preferably inherits its position from either parent) and the preventive "translation" is against the spirit of CX.
Anyway multiple studies (e.g. 1) have shown that for TSP the key is to preserve the relative position of cities and the edges.
So it could work, but you have to experiment. Some form of mutation is another possibility.
Probably, if the characteristics of CX aren't satisfying, a different crossover operator is a better choice: staying with simple operators, one of the most successful is Order Crossover (e.g. 2).
L. Darrell Whitley, Timothy Starkweather, D'Ann Fuquay - Scheduling problems and traveling salesmen: The genetic edge recombination operator - 1989.
Pablo Moscato - On Genetic Crossover Operators for Relative Order Preservation.
Suppose we have finite data set {x_i, y_i}.
I am looking for an efficient data structure for the data set, such that given a,b it will be possible to find efficiently x,y such that x > a, y > b and x*y is minimal.
Can it be done using a red black tree ?
Can we do it in complexity O(log n)?
Well, without a preprocessing step, of course you can't do it in O(log n) with any data structure, because that's not enough time to look at all the data. So I'll assume you mean "after preprocessing".
Red-black trees are intended for single-dimension sorting, and so are unlikely to be of much help here. I would expect a kD-tree to work well here, using a nearest-neighbor-esque query: You perform a depth-first traversal of the tree, skipping branches whose bounding rectangle either violates the x and y conditions or cannot contain a lower product than the lowest admissible product already found. To speed things up further, each node in the kD-tree could additionally hold the lowest product among any of its descendants; intuitively I would expect that to have a practical benefit, but not to improve the worst-case time complexity.
Incidentally, red-black trees aren't generally useful for preprocessed data. They're designed for efficient dynamic updates, which of course you won't be doing on a per-query basis. They offer the same asymptotic depth guarantees as other sorted trees, but with a higher constant factor.
In Differential Evolution Algorithm for optimization problems.
There are three evolutionary processes involved, that is mutation crossing over and selection
I am just a beginner but I have tried removing the crossing over process and there is no significant difference result from the original algorithm.
So what is the importance of crossing over in Differential Evolution Algorithm?
If you don't use crossover may be your algorithm just explore the problem search space and doesn't exploit it. In general an evolutionary algorithm succeeds if it makes good balance between exploration and exploitation rates.
For example DE/rand/1/Either-Or is a variant of DE which eliminates crossover operator but uses effective mutation operator. According to Differential Evolution: A Survey of the State-of-the-Art, in this Algorithm, trial vectors that are pure mutants occur with a probability pF and those that are pure recombinants occur with a probability 1 − pF. This variant is shown to yield competitive results against classical DE-variants rand/1/bin and target-to-best/1/bin (Main Reference).
X(i,G) is the i-th target (parent) vector of Generation G, U(i,G) is it's corresponding trial vector,F is difference vector scale factor and k = 0.5*(F + 1)[in the original paper].
In this scheme crossover isn't used but mutation is effective enough to compare with original DE algorithm.