maximum number of edges in graph using - optimization

How can I find the maximum number of edges from a graph using only a limited number of nodes say N nodes. A selected edge uses both sides of it.

A known problem. Sounds as though your problem is known as k-densest subgraph problem (your N playing the role of k).
Bad news: This problem is known to be NP-hard, even if none of your IPs occur more than three times or the input graph is bipartite. Are you aware of any other special properties of your graph?
More bad news: From a quick look around the recent literature, it looks as though not even algorithms with good error bounds are readily available. To my mind, the problem could be nasty even with "only" 100 input nodes.
Best of luck!

Related

Verifying nearest neighbors algorithm is working correctly

I am implementing a complex algorithm for determining the n-nearest neighbors in a high dimensional embedding space from a paper I found online. After I finished the implementation, I wanted to check the results to make sure the code did indeed return the desired n-nearest neighbors. In order to to do so, I check to see if the results are equal to a brute-force search across every element in the embedding space and find the n-nearest neighbors.
The issue arises when there are multiple elements with the same distance from the query input.
For example, if I am checking for the 3-nearest neighbors, and I have four points, one of which is closest and the other 3 all equidistant from the search key, one element will necessarily be left out. I'd like to test to ensure that the two implementations are roughly the same, and I am not interested in the exact details of which elements are left out. As a result, I can't just do an element-wise equality check across the complex algorithm and the brute-force solution.
For business reasons, it is actually helpful if the element left out is random, because I want the end user to see a variety of results, as long as all results are equally relevant. I specifically do not want a stable-ordering on the results.
Is there an off-the-shelf solution for this problem? I am implementing this code in Python, but the solution can be language agnostic.

Finding best path trought strongly connected component

I have a directed graph which is strongly connected and every node have some price(plus or negative). I would like to find best (highest score) path from node A to node B. My solution is some kind of brutal force so it takes ages to find that path. Is any algorithm for this or any idea how can I do it?
Have you tried the A* algorithm?
It's a fairly popular pathfinding algorithm.
The algorithm itself is not to difficult to implement, but there are plenty of implementations available online.
Dijkstra's algorithm is a special case for the A* (in which the heuristic function h(x) = 0).
There are other algorithms who can outperform it, but they usually require graph pre-processing. If the problem is not to complex and you're looking for a quick solution, give it a try.
EDIT:
For graphs containing negative edges, there's the Bellman–Ford algorithm. Detecting the negative cycles comes at the cost of performance, though (worse than the A*). But it still may be better than what you're currently using.
EDIT 2:
User #templatetypedef is right when he says the Bellman-Ford algorithm may not work in here.
The B-F works with graphs where there are edges with negative weight. However, the algorithm stops upon finding a negative cycle. I believe that is a useful behavior. Optimizing the shortest path in a graph that contains a cycle of negative weights will be like going down a Penrose staircase.
What should happen if there's the possibility of reaching a path with "minus infinity cost" depends on the problem.

Optimize pattern of rotating holes for all combinations

Sort of a programming question, sort of a general logic question. Imagine a circular base with a pattern of circles:
And another circle, mounted above and able to rotate, with holes that expose the colored circles below:
There must be an optimal pattern of either the colored circles or the openings (or both) that will allow for all N possible combinations of colors... but I have no idea how to attack the problem! At this point, combinations of 2 seem probably the easiest and would be fine as a starting point (red/blue, red/green, red/white, etc).
I would imagine there will need to be gaps in the colors, unlike the example above. Any suggestions welcome!
Edit: clarified the question (hopefully!) thanks to feedback from Robert Harvey
For two holes, you could look for a perfect matching in a bipartite graph, each permutation described by two nodes, one in each partition. Nodes would be connected if they share one element, i.e. the (blue,red) node from the first partition connected to the (red,green) node of the second. The circles arranged in the same distance would allow for both of these patterns. A perfect matching in that graph would correspond to chains or cycles of permutations where two of them always share a single color. A bit like dominoes. If you had a set of cycles of the same length, you could interleave them to form the pattern on the lower disk. I'm not sure how easy it will be to obtain these same length cycles, though, and I also don't know how to generalize this to more than two elements in each permutation.

Travelling Salesman and Map/Reduce: Abandon Channel

This is an academic rather than practical question. In the Traveling Salesman Problem, or any other which involves finding a minimum optimization ... if one were using a map/reduce approach it seems like there would be some value to having some means for the current minimum result to be broadcast to all of the computational nodes in some manner that allows them to abandon computations which exceed that.
In other words if we map the problem out we'd like each node to know when to give up on a given partial result before it's complete but when it's already exceeded some other solution.
One approach that comes immediately to mind would be if the reducer had a means to provide feedback to the mapper. Consider if we had 100 nodes, and millions of paths being fed to them by the mapper. If the reducer feeds the best result to the mapper than that value could be including as an argument along with each new path (problem subset). In this approach the granularity is fairly rough ... the 100 nodes will each keep grinding away on their partition of the problem to completion and only get the new minimum with their next request from the mapper. (For a small number of nodes and a huge number of problem partitions/subsets to work across this granularity would be inconsequential; also it's likely that one could apply heuristics to the sequence in which the possible routes or problem subsets are fed to the nodes to get a rapid convergence towards the optimum and thus minimize the amount of "wasted" computation performed by the nodes).
Another approach that comes to mind would be for the nodes to be actively subscribed to some sort of channel, or multicast or even broadcast from which they could glean new minimums from their computational loop. In that case they could immediately abandon a bad computation when notified of a better solution (by one of their peers).
So, my questions are:
Is this concept covered by any terms of art in relation to existing map/reduce discussions
Do any of the current map/reduce frameworks provide features to support this sort of dynamic feedback?
Is there some flaw with this idea ... some reason why it's stupid?
that's a cool theme, that doesn't have that much literature, that was done on it before. So this is pretty much a brainstorming post, rather than an answer to all your problems ;)
So every TSP can be expressed as a graph, that looks possibly like this one: (taken it from the german Wikipedia)
Now you can run a graph algorithm on it. MapReduce can be used for graph processing quite well, although it has much overhead.
You need a paradigm that is called "Message Passing". It was described in this paper here: Paper.
And I blog'd about it in terms of graph exploration, it tells quite simple how it works. My Blogpost
This is the way how you can tell the mapper what is the current minimum result (maybe just for the vertex itself).
With all the knowledge in the back of the mind, it should be pretty standard to think of a branch and bound algorithm (that you described) to get to the goal. Like having a random start vertex and branching to every adjacent vertex. This causes a message to be send to each of this adjacents with the cost it can be reached from the start vertex (Map Step). The vertex itself only updates its cost if it is lower than the currently stored cost (Reduce Step). Initially this should be set to infinity.
You're doing this over and over again until you've reached the start vertex again (obviously after you visited every other one). So you have to somehow keep track of the currently best way to reach a vertex, this can be stored in the vertex itself, too. And every now and then you have to bound this branching and cut off branches that are too costly, this can be done in the reduce step after reading the messages.
Basically this is just a mix of graph algorithms in MapReduce and a kind of shortest paths.
Note that this won't yield to the optimal way between the nodes, it is still a heuristic thing. And you're just parallizing the NP-hard problem.
BUT a little self-advertising again, maybe you've read it already in the blog post I've linked, there exists an abstraction to MapReduce, that has way less overhead in this kind of graph processing. It is called BSP (Bulk synchonous parallel). It is more freely in the communication and it's computing model. So I'm sure that this can be a lot better implemented with BSP than MapReduce. You can realize these channels you've spoken about better with it.
I'm currently involved in an Summer of Code project which targets these SSSP problems with BSP. Maybe you want to visit if you're interested. This could then be a part solution, it is described very well in my blog, too. SSSP's in my blog
I'm excited to hear some feedback ;)
It seems that Storm implements what I was thinking of. It's essentially a computational topology (think of how each compute node might be routing results based on a key/hashing function to the specific reducers).
This is not exactly what I described, but might be useful if one had a sufficiently low-latency way to propagate current bounding (i.e. local optimum information) which each node in the topology could update/receive in order to know which results to discard.

Minimizing pen lifts in a pen plotter or similar device

I'm looking for references to algorithms for plotting on a mechanical pen plotter.
Specifically, I have a list of straight vectors, each representing a line to be plotted. First I want to remove duplicate vectors, so each line is only plotted once. That's easy enough.
Second, there are many vectors that intersect, sometimes at endpoints, but not always. They can be plotted in any order, but I want to find an order that reduces the number of times the pen must be lifted, preferably to a minimum though I understand that may take a long time to compute, if it's computable at all. Vectors that intersect can be broken into smaller vectors if that helps. But generally, if the pen is moving in a straight line, it's best to keep it moving that way as long as possible. So, two parallel vectors joined end to end could be combined into a single vector, etc.
This sounds like some variety of graph theory problem, but I don't know much about that. Can anyone point me to references or algorithms I need to study? Or maybe example code?
Thanks,
Neil
The problem is an example of the Chinese postman problem which is an NP-complete problem. The most wellknown NP-complete problem is the Travelling Salesman. Common for all NP-complete problems are that they can all be translated into eachother. There are no known algorithms for solving any of them in a time that is polynomial dependent of the number of nodes in the input, they are non-polynomial (NP).
For your case I would suggest some simple heuristics. Don't overdo it, just pick anything quite simple like going in a straight line as long as possible and then lift the pen to the closest available starting point and go on from there.