Time complexity of counting cycles and complete subgraphs of a given size

I am very new to graph theory and I am trying to understand the following. Given an undirected graph G with V vertices and E edges, what is the time complexity of counting all complete subgraphs of size k, and what is the time complexity of counting all cycles of length k? Basically, which one is faster? I am only considering cases where k is even. Are there any well-known references for this?

Counting all cycles of length k can, in general, be achieved in O(VE) time (V = number of vertices, E = number of edges), although this bound can be greatly improved depending on k (Ref). Counting all complete subgraphs (cliques) of size k can be achieved in O(n^k k^2) time in the general case (with n = V), by checking every k-subset of vertices, but again, this bound can be improved depending on the structure of the graph and the algorithm used (Ref).
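For intuition, here is a minimal brute-force sketch of the O(n^k k^2) clique count, assuming the graph is given as a 0/1 adjacency matrix A (the function name countKCliques is illustrative, and this approach is only practical for small n and k):

function c = countKCliques(A, k)
    % Enumerate all nchoosek(n, k) vertex subsets and test each one;
    % each test inspects O(k^2) entries of the induced submatrix.
    n = size(A, 1);
    subsets = nchoosek(1:n, k);              % one candidate vertex set per row
    c = 0;
    for s = 1:size(subsets, 1)
        B = A(subsets(s,:), subsets(s,:));   % induced k-by-k submatrix
        if all(B(~eye(k)))                   % every off-diagonal entry is 1 => clique
            c = c + 1;
        end
    end
end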

Related

Time complexity of CVXOPT/MOSEK when the number of constraints is much greater than the number of variables

I have a convex quadratic programming problem:
minimize x^T P x + c^T x
subject to A x <= b
where P is a positive definite matrix and A is an m-by-n matrix with m much greater than n, so the number of constraints is much greater than the number of variables.
My questions are: (1) how do I analyze the time complexity of this problem, and (2) how does the time complexity of a convex quadratic programming problem relate to the number of constraints?
I have tried to solve my problem using both cvxopt and mosek, and the results of both suggest that the running time is roughly linear in the number of constraints.
However, when I looked for literature, everything I found only discusses how the time complexity relates to the number of variables, or assumes A is a full-rank matrix. I would appreciate it if you could recommend some related references. Thank you.
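As a rough empirical check, here is a minimal timing sketch, using MATLAB's quadprog (Optimization Toolbox) as a stand-in for cvxopt/mosek; that substitution is my assumption, since all three expose similar QP interfaces. Note that quadprog minimizes (1/2) x^T H x + f^T x, so H = 2P matches the objective x^T P x + c^T x above:

n = 50;
P = eye(n);                          % positive definite Hessian
c = randn(n, 1);
opts = optimoptions('quadprog', 'Display', 'off');
for m = [1000 2000 4000 8000]        % grow the constraint count, n fixed
    A = randn(m, n);
    b = abs(randn(m, 1)) + 1;        % keeps x = 0 strictly feasible
    tic;
    quadprog(2*P, c, A, b, [], [], [], [], [], opts);
    fprintf('m = %5d constraints: %.3f s\n', m, toc);
end

If the reported times roughly double as m doubles, that is consistent with the linear-in-constraints behaviour you observed.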

Graph Between Steiner Tree and Complete Graph

Given a set of points P in the plane, and given a threshold t, I'd like to compute a connected graph G to minimize the sum of the lengths of its edges, subject to the following constraints:
The vertices of G contain all the points in P.
For every pair of points u and v in P, their distance in G is no greater than t times their Euclidean distance.
When t=1, this problem is solved by constructing a complete graph on P. When t is infinite (or simply large enough), this problem is the Euclidean Steiner Tree Problem.
If there is already a name for this problem, I'm curious what it is. More than that, does anyone have any suggestions for how to design an algorithm for it? Since it contains the Euclidean Steiner Tree Problem as a special case, it can't be any easier, so I'm not looking for anything particularly time-efficient. Thanks!
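As an illustration of the constraint, here is a minimal sketch that checks whether a candidate graph meets the stretch requirement, assuming the first n rows of pts hold the points of P and any Steiner points come after them (the function name satisfiesStretch is mine; pdist and squareform require the Statistics and Machine Learning Toolbox):

function ok = satisfiesStretch(pts, edges, t, n)
    % pts: coordinates of all vertices; edges: m-by-2 endpoint indices;
    % t: stretch threshold; n: number of original points in P.
    d = sqrt(sum((pts(edges(:,1),:) - pts(edges(:,2),:)).^2, 2));  % edge lengths
    G = graph(edges(:,1), edges(:,2), d, size(pts, 1));
    D = distances(G);                       % all-pairs shortest paths in G
    E = squareform(pdist(pts(1:n,:)));      % all-pairs Euclidean distances in P
    ok = all(all(D(1:n,1:n) <= t * E + 1e-12));
end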

Finding the Shortest Path using BFS on an Undirected Graph, knowing the length of the SP

I was asked an interview question today and was not able to solve it at the time.
The question is to get the minimum time complexity of finding the shortest path from node S to node T in a graph G where:
G is undirected and unweighted
The connection factor of G is given as B
The length of the shortest path from S to T is given as K
The first thing I thought was that, in the general case, BFS is the fastest way to get the SP from S to T, in O(V+E) time. Then how can we use B and K to reduce the time? I wasn't sure what a connection factor is, so I asked the interviewer, and he told me that it means a node has, on average, B edges to other nodes. So I was thinking that if K = 1, the time complexity should be O(B). But wait, it is "on average", which means it could still be O(V+E), for example when the graph is a star and all other nodes are connected to S.
If we assume that B is a strict upper limit, then the first round of BFS is O(B), the second is O(B^2), and so on, like a tree. Some of the nodes in a lower layer may already have been visited in a previous round and therefore should not be added again. Still, the worst case is that the graph is huge and none of the nodes has been visited, and the time complexity is
O(B) + O(B^2) + O(B^3) + ... + O(B^K)
Using the geometric series sum, this is O(B(B^K - 1)/(B - 1)), which is O(B^K) for B >= 2. But this SUM should not exceed O(V+E).
So, is the time complexity O(min(SUM, V+E))?
I have no idea how to correctly solve this problem. Any help is appreciated.
Your analysis seems correct. Please refer to the following references.
http://axon.cs.byu.edu/~martinez/classes/312/Slides/Paths.pdf
https://courses.engr.illinois.edu/cs473/sp2011/lectures/03_class.pdf
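For concreteness, here is a minimal sketch of the depth-limited BFS described in the question, assuming the graph is given as a cell-array adjacency list (the function name bfsToDepthK is illustrative). The work done is bounded by min(B + B^2 + ... + B^K, V + E) edge scans:

function found = bfsToDepthK(adj, s, t, K)
    % adj{v} is a row vector of the neighbours of v.
    if s == t, found = true; return; end
    visited = false(numel(adj), 1);
    visited(s) = true;
    frontier = s;
    for depth = 1:K
        next = [];
        for v = frontier
            for w = adj{v}
                if ~visited(w)               % each edge is scanned at most once
                    visited(w) = true;
                    next(end+1) = w;         %#ok<AGROW>
                end
            end
        end
        if any(next == t), found = true; return; end
        frontier = next;
    end
    found = false;                           % no path of length <= K exists
end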

On what does the time complexity of graph algorithms depend?

I stumbled over this question in my textbook:
"In general, on what does the time complexity of Prim's, Kruskal's and Dijkstra's algorithms depends on?"
a. The number of vertices in the graph.
b. The number of edges in the graph.
c. Both, on the number of vertices and edges in the graph.
Explain your choice.
So according to Wikipedia, the worst-case time complexities of Prim's, Kruskal's and Dijkstra's algorithms are O(E log V), O(E log V) and O(E + V log V) respectively. So I guess the answer is (c)? But why?
I don't know about Prim's and Kruskal's, and I might be wrong about Dijkstra's, but I think in its case the answer would be (b), because:
Dijkstra's will visit nodes on the shortest known path until it finds the destination.
This implies that if two edges point to the same node, only one of them will ever be followed by the algorithm, since the other has a higher (or equal) weight, rendering it moot to follow.
Therefore, the only way to increase the time spent traversing the graph by adding edges is by adding nodes (adding edges to an existing node can change the algorithm's traversal time, but not in proportion to the number of edges, only to their weights).
Therefore, my intuition is that only the number of nodes is in direct relation to the running time. The Dijkstra's Algorithm Wikipedia page seems to confirm this:
The simplest implementation of Dijkstra's algorithm stores the vertices of set Q in an ordinary linked list or array, and extracting the minimum from Q is simply a linear search through all vertices in Q. In this case, the running time is O(E + V^2) = O(V^2).
This is only an intuition of course, and cs.stackexchange might be of greater use.
The answer is (c), because both V and E contribute to the asymptotic complexity of the respective algorithms. On further analysis one could argue that V contributes less in Kruskal's and Prim's (since it appears only inside the log factor), but E has roughly the same weight in all three cases.
Also, note that |E| <= |V|^2 always holds for simple graphs.
In the worst case the graph is a complete graph, i.e. it has v(v-1)/2 edges, so e >> v and e ~ v^2.
The time complexities of Prim's and Dijkstra's algorithms are:
1. With an adjacency list and a priority queue: O((v + e) log v); in the worst case e >> v, so this is O(e log v).
2. With an adjacency matrix and a priority queue: O(v^2 + e log v); in the worst case e ~ v^2, so O(v^2 + e log v) ~ O(e + e log v) ~ O(e log v).
3. When the graph gets denser (the worst case being a complete graph), we use a Fibonacci heap and an adjacency list: O(e + v log v).
The time complexity of Kruskal's is O(e log e). In the worst case e ~ v^2, so log(v^2) = 2 log v, and we can safely say that O(e log e) is O(2e log v), i.e. O(e log v) in the worst case.
As you said, the time complexities of O(E log V), O(E log V), and O(E + V log V) mean that each one depends on both E and V. This is because each algorithm involves considering the edges and their respective weights in a graph. Since for Prim’s and Kruskal’s the MST has to be connected and include all vertices, and for Dijkstra’s the shortest path has to pass from one vertex to another through intermediary vertices, the vertices also have to be considered in each algorithm.
For example, with Dijkstra’s algorithm, you are essentially looking to add edges that are both low in cost and that connect vertices that will eventually provide a path from the starting vertex to the ending vertex. To find the shortest path, you cannot solely look for a path that connects the start vertex to the end, and you cannot solely look for the smallest weighted edges; you need to consider both. Since you are considering both edges and vertices, the time it takes to make these considerations throughout the algorithm will depend on the number of edges and the number of vertices.
Additionally, different time complexities are possible through different implementations of the three algorithms, and analyzing each algorithm requires considering both E and V.
For example, Prim’s algorithm is O(V^2), but can be improved with a min-heap-based priority queue to achieve the complexity you found: O(E log V). O(E log V) may seem like the faster algorithm, but that’s not always the case: E can be as large as V^2, so in dense graphs with close to V^2 edges, O(E log V) becomes O(V^2 log V), which is worse than the O(V^2) of the simple implementation. If V is very small then there is not much difference between O(V^2) and O(E log V). E and V also influence the running time through the way the graph is stored. For example, an adjacency list becomes inefficient for dense graphs (with E approaching V^2) because checking whether an edge exists in the graph goes from close to O(1) to O(V).
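To make the "simplest implementation" quoted earlier concrete, here is a minimal sketch of Dijkstra's algorithm with a linear-scan extract-min, assuming the graph is an n-by-n weight matrix W with Inf where no edge exists (the function name dijkstraLinear is illustrative). The n extractions cost O(V) each, giving the O(E + V^2) = O(V^2) running time:

function dist = dijkstraLinear(W, s)
    n = size(W, 1);
    dist = inf(n, 1); dist(s) = 0;
    done = false(n, 1);
    for iter = 1:n
        d = dist; d(done) = Inf;
        [du, u] = min(d);                % extract-min by linear search: O(V)
        if isinf(du), break; end         % remaining vertices are unreachable
        done(u) = true;
        for v = find(W(u,:) < Inf)       % relax every edge out of u
            if dist(u) + W(u,v) < dist(v)
                dist(v) = dist(u) + W(u,v);
            end
        end
    end
end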

Optimize MATLAB code (nested for loop to compute similarity matrix)

I am computing a similarity matrix based on Euclidean distance in MATLAB. My code is as follows:
for i = 1:N   % x is M-by-N (M dimensions, N points); D is the N-by-N similarity matrix
    for j = 1:N
        D(i,j) = sqrt(sum((x(:,i) - x(:,j)).^2));   % Euclidean distance between points i and j
    end
end
Can anyone help with optimizing this, i.e. reducing the for loops, as my matrix x is of dimension 256x30000?
Thanks a lot!
--Aditya
The function to do this in MATLAB is called pdist. Unfortunately it is painfully slow and doesn't take MATLAB's vectorization abilities into account.
The following is code I wrote for a project. Let me know what kind of speed up you get.
Qx = repmat(dot(x,x,2), 1, size(x,1));   % squared norm of each row, replicated into an N-by-N matrix
D = sqrt(Qx + Qx' - 2*x*x');             % ||xi||^2 + ||xj||^2 - 2*xi'*xj = ||xi - xj||^2
Note, though, that this will only work if your data points are in the rows and your dimensions in the columns. So, for example, let's say I have 256 data points and 100000 dimensions; then on my Mac, using x = rand(256,100000), the above code produces a 256x256 matrix in about half a second.
There's probably a better way to do it, but the first thing I noticed was that you could cut the runtime in half by exploiting the symmetry D(i,j) == D(j,i).
You can also use the function norm(x(:,i)-x(:,j),2)
I think this is what you're looking for.
D = zeros(N);
jIndx = repmat(1:N, N, 1);
iIndx = jIndx';   % transpose to get the matching row indices
D(:) = sqrt(sum((x(iIndx(:),:) - x(jIndx(:),:)).^2, 2));
Here, I have assumed that the data matrix x is initialized as an N-by-M array, where M is the number of dimensions of the system and N is the number of points. So if your ordering is different, you'll have to make changes accordingly.
To start with, you are computing twice as much as you need to here, because D will be symmetric. You don't need to calculate the (i,j) entry and the (j,i) entry separately. Change your inner loop to for j=1:i, and add D(j,i) = D(i,j); to the body of that loop, as in the sketch below.
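A minimal sketch of that change, under the same assumptions as the original loop:

for i = 1:N
    for j = 1:i
        D(i,j) = sqrt(sum((x(:,i) - x(:,j)).^2));
        D(j,i) = D(i,j);   % fill the mirror entry for free
    end
end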
After that, there's really not much redundancy left in what the code does, so your only remaining room for improvement is to parallelize it: if you have the Parallel Computing Toolbox, convert your outer loop to a parfor and, before you run it, call matlabpool(n) (parpool(n) on newer MATLAB releases), where n is the number of workers to use.
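For reference, a minimal sketch of that parfor version (my sketch, not code from the thread). Note that parfor's sliced-variable rules forbid writing D(i,j) with an inner loop index, so each worker fills a whole row and the symmetry saving is traded for parallelism:

D = zeros(N);
parfor i = 1:N
    row = zeros(1, N);                   % per-worker row buffer
    xi = x(:, i);
    for j = 1:N
        row(j) = sqrt(sum((xi - x(:, j)).^2));
    end
    D(i, :) = row;                       % valid sliced assignment
end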