Finding cycles in a graph (not necessarily Hamiltonian or visiting all the nodes) - optimization

I have graph like one in Figure 1 (the first image) and want to connect the red nodes to have cycle, but cycles do not have to be Hamiltonian like Figure 2 and Figure 3 (the last two images). The problem has much bigger search space than TSP since we can visit a node twice. Like the TSP, it is impossible to evaluate all the combinations in a large graph and I should try heuristic, but the problem is that, unlike the TSP, the length of cycles or tours is not fixed here. Because, visiting all the blue nodes is not mandatory and this cause having variable length including some of the blue nodes. How can I generate a possible "valid" combination every time for evaluation? I mean, a cycle can be {A, e, B, l, k, j, D, j, k, C, g, f, e} or {A, e, B, l, k, j, D, j, i, h , g, C, g, f, e}, but not {A, e, B, l, k, C, g, f, e} or {A, B, k, C, i, D}.
Update:
The final goal is to evaluate which cycle is optimal/near optimal considering length and risk (see below). So I am not only going to minimize the length but minimizing the risk as well. This cause not being able to evaluate risk of cycle unless you know all its nodes sequence. Hope this clarifies why I can not evaluate new cycle at the middle of its generating process.
We can:
generate and evaluate possible cycles one by one;
or generate all possible cycles and then do their evaluation.
Definition of the risk:
Assume cycle is a ring which connects primary node (one of the red nodes) to all other red nodes. In case of failure in any part (edge) of the ring, no red nodes should be disconnected form the primary node (this is desired). However there are some edges we have to pass twice (due to not having Hamiltonian cycle which connects all the red nodes) and in case of failure in those edges, some of red nodes may be totally disconnected. So risk of cycle is summation of the length of risky edges (we have twice in our ring/tour) multiplied by number of red nodes we lose in case of cutting each risky edge.
A real example of 3D graph I am working on including 5 red nodes and 95 blue nodes is in below:
And here is link to Excel sheet containing adjacency matrix of the above graph (the first five nodes are red and the rests are blue).

Upon a bit more reflection, I decided it's probably better to just rewrite my solution, as the fact that you can use red nodes twice, makes my original idea of mapping out the paths between red nodes inefficient. However, it isn't completely wasted, as the blue nodes between red nodes is important.
You can actually solve this using a modified version of BFS, as more-less a backtracking algorithm. For each unique branch the following information is stored, most of which simply allows for faster rejection at the cost of more space, only the first two items are actually required:
The full current path. (list with just the starting red node)
The remaining red nodes. (initially all red nodes)
The last red node. (initially the start red node)
The set of blue nodes since last red node. (initially empty)
The set of nodes with a count of 1. (initially empty)
The set of nodes with a count of 2. (initially empty)
The algorithm starts with a single node then expands adjacent nodes using BFS or DFS, this repeats until the result is a valid tour or is the node to be expanded is rejected. So the basic psudoish code (current path and remaining red points) looks something like below. Where rn is the set of red nodes, t is the list of valid tours, p/p2 is a path of nodes, r/r2 is a set of red nodes, v is the node to be expanded, and a is a possible node to expand to.
function PATHS2HOME(G,rn)
create a queue Q
create a list t
p = empty list
v ← rn.pop()
r ← rn
add v to p
Q.enqueue((p,r))
while Q is not empty
p, r ← Q.dequeue()
if r is empty and the first and last elements of p are the same:
add p to t
else
v ← last element of p
for all vertices a in G.adjacentVertices(v) do
if canExpand(p,a)
p2 ← copy(p)
r2 ← copy(r)
add a to the end of p2
if isRedNode(a) and a in r2
remove a from r2
Q.enqueue( (p2,r2) )
return t
The following conditions prevent expansion of a node. May not be a complete list.
Red nodes:
If it is in the set of nodes that have a count of 2. This is because the red node would have been used more than twice.
If it is equal to the last red node. This prevents "odd" tours when a red node is adjacent to three other blue nodes. Thus say the red node A, was adjacent to blue nodes b, c and d. Then you would end a tour where part of the tour looks like b-A-c-A-d.
Blue nodes:
If it is in the set of nodes that have a count of 2. This is because the red node would have been used more than twice.
If it is in the set of blue nodes since last red node. This is because it would cause a cycle of blue nodes between red nodes.
Possible optimizations:
You could map out the paths between red nodes, use that to build something of a suffix tree, that shows red nodes that can be reached given the following path Like. The benefit here is that you avoid expanding a node if the path that expansion leads to red nodes that have already been visited twice. Thus this is only a useful check once at least 1 red node has been visited twice.
Use a parallel version of the algorithm. A single thread could be accessing the queue, and there is no interaction between elements in the queue. Though I suspect there are probably better ways. It may be possible to cut the runtime down to seconds instead of hundreds of seconds. Although that depends on the level of parallelization, and efficiency. You could also apply this to the previous algorithm. Actually the reasons for which I switched to using this algorithm are pretty much negated by
You could use a stack instead of a queue. The main benefit here is by using a depth-first approach, the size of the queue should remain fairly small.

Related

What's the fastest way to find if a point is in one of many rectangles?

So basically im doing this for my minecraft spigot plugin (java). I know there are already some land claim plugins but i would like to make my own.
For this claim plugin i'd like to know how to get if a point (minecraft block) is inside a region (rectangle). i know how to check if a point is inside a rectangle, the main problem is how to check as quickly as possible when there are like lets say 10.000 rectangles.
What would be the most efficient way to check 10.000 or even 100.000 without having to manually loop through all of them and check every single rectangle?
Is there a way to add a logical test when the rectangles get generated in a way that checks if they hold that point? In that case you could set a boolean to true if they contain that point when generated, and then when checking for that minecraft block the region (rectangle) replies with true or false.
This way you run the loops or checks when generating the rectangles, but when running the game the replies should happen very fast, just check if true or false for bool ContainsPoint.
If your rectangles are uniformly placed neighbors of each other in a big rectangle, then finding which rectangle contains point is easy:
width = (maxX-minX)/num_rectangles_x;
height = same but for y
idx = floor( (x - minX)/width );
idy = floor( (y - minY)/height );
id_square = idx + idy*num_rectangles_x;
If your rectangles are randomly placed, then you should use a spatial acceleration structure like octree. Then check if point is in root, then check if point is in one of its nodes, repeat until you find a leaf that includes the point. 10000 tests per 10milliseconds should be reachable on cpu. 1 million tests per 10ms should be ok for a gpu. But you may need to implement a sparse version of the octree and a space filling curve order for leaf nodes to have better caching, to reach those performance levels.

Is it possible for a red black tree to have a node being black, node's parent being black, but the parent has no siblings?

I'm reading through red black tree chapter from the book "introduction to algorithms" by Cormen. In that delete chapter, it basically says if both the node and node's parent are black, then the sibling of node's parent has to exist. The text is in page 327 first paragraph.
I don't get the logic for it, like how do we deduct something like that? Is that even true?
Though I did try to implement my own version of RB Tree and I couldn't produce a tree with a subtree like
/
R
/
B
/ \
B ...
/
...
or
/
B
/
B
/ \
B ...
/
...
Basically both node and parent are black but parent doesn't have sibling. Can't produce it so far
The text is correct. The only time you can have a black node with no sibling is when the node in question is the tree root. The easy way to think of this is to draw out the B-tree equivalent, Moving red nodes up so they are part of the same block as their parent black node. When you do this it becomes easy to see that a non-root black node with no sibling is going to produce an unbalanced tree.
In fact the two options for a non-root non-leaf black node are that either the sibling is black, or the sibling is red and has two black children (even if those two children are leaf nodes and simply represented by a null value in the implementation).
SoronelHaetir is right. But there are 3 things to say about fixups of deletes:
When "sibling" is talked about, this is normally the sibling of the current Node. It is not usually parent's sibling. So if the current Node is the left child of the parent then the "sibling" is the right child of the parent.
You say "I don't get the logic for it, like how do we deduct something like that? Is that even true?". Yes it is. One of the rules is the total black Node count down every path is the same.
So in a correct RB tree, for a given node, if the left child is black, then the right child is black (or it is red and it has 2 black children). For a correct RB Tree, you cannot have any node with an unbalanced count of black nodes down all paths. The count has to be the same in every path.
The other situation where you have unbalanced black tree count is where a leaf node has been deleted and it is black, and you are in the process of restoring the black tree count. Until it is restored (which can always be done), the tree is not a valid RB Tree.

Binary Search Tree Minimum Value

I am new to binary search tree data structure. One thing I don't understand is why the leftest node is the smallest
10
/ \
5 12
/ \ / \
1 6 0 14
In the above instance, 0 is the smallest value not 1.
Let me know where I got mixed up.
Thank you!
That tree is not binary search tree.
Creating a binary search tree is a process which starts with adding
elements.
You can do it with array.
First there is no element so make it root.Then start adding elements as node.If the new value is bigger than before add it array[2 x n + 1] (call index of the last value: n). If it is smaller than before add it to array[2 x n]. So all values left of any node is smaller than it and all values right of any node is bigger than it. Even 10 and 6's place.6 cannot be 11.(at your tree,it isn't actually.).That's all !
For a tree to be considered as a binary search tree, it must satisfy the following property:
... the key in each node must be greater than all keys stored in the left sub-tree, and smaller than all keys in right sub-tree
Source: https://en.wikipedia.org/wiki/Binary_search_tree
The tree you posted is not a binary search tree because the root node (10) is not smaller than all keys in the right sub-tree (node 0)
I'm not really sure of your question, but binary search works by comparing the search-value to the value of the node, starting with the root node (value 10 here). If the search-value is less, it then looks at the left node of the root (value 5), otherwise it looks next at the right node (12).
It doesn't matter so much where in the tree the value is as long as the less and greater rule is followed.
In fact, you want to have trees set up like this (except for the bad 0 node), because the more balanced a tree is (number of nodes on left vs. number of nodes on right), the faster your search will be!
A tree balancing algorithm might, for example, look for the median value in a list of values and make that the value of the root node.

Gain maximization on trees

Consider a tree in which each node is associated with a system state and contains a sequence of actions that are performed on the system.
The root is an empty node associated with the original state of the system. The state associated with a node n is obtained by applying the sequence of actions contained in n to the original system state.
The sequence of actions of a node n is obtained by queuing a new action to the parent's sequence of actions.
Moving from a node to another (i.e., adding a new action to the sequence of actions) produces a gain, which is attached to the edge connecting the two nodes.
Some "math":
each system state S is associated with a value U(S)
the gain achieved by a node n associated with the state S cannot be greater than U(S) and smaller than 0
If n and m are nodes in the tree and n is the parent of m, U(n) - U(m) = g(n,m), i.e., the gain on the edge between n and m represents the reduction of U from n to m
See the figure for an example.
My objective is the one of finding the path in the tree that guarantees the highest gain (where the gain of a path is computed by summing all the gains of the edges on the path):
Path* = arg max_{path} (sum g(n,m), for each adjacent n,m in path)
Notice that the tree is NOT known at the beginning, and thus a solution that does not require to visit the entire tree (discarding those paths that for sure do not bring to the optimal solution) to find the optimal solution would be the best option.
NOTE: I obtained an answer here and here for a similar problem in offline mode, i.e., when the graph was known. However, in this context the tree is not known and thus algorithms such as Bellman-Ford would perform no better than a brute-fore approach (as suggested). Instead, I would like to build something that resembles backtracking without building the entire tree to find the best solution (branch and bound?).
EDIT: U(S) becomes smaller and smaller as depth increases.
As you have noticed, a branch and bound can be used to solve your problem. Just expand the nodes that seem the most promising until you find complete solutions, while keeping track of the best known solution. If a node has a U(S) lower than the best known solution during the process, just skip it. When you have no more node, you are done.
Here is an algorithm :
pending_nodes <- (root)
best_solution <- nothing
while pending_nodes is not empty
Drop the node n from pending_nodes having the highest U(n) + gain(n)
if n is a leaf
if best_solution = nothing
best_solution <- n
else if gain( best_solution ) < gain( n )
best_solution <- n
end if
else
if best_solution ≠ nothing
if U(n) + gain(n) < gain(best_solution)
stop. best_solution is the best
end if
end if
append the children of n to pending_nodes
end if
end while

Check every value in a list in TI-Basic

I'm writing a snake game in TI-Basic, and every time I move I need to see if the head of the snake has hit any point in the tail. The tail is stored as a circular list-based queue, and I can add the beginning and end in constant time.
The only hard part is that I have to do something similar to this on every iteration:
(S = Size of the list)
For(I,1,S)
If X=LX(I) and Y=LY(I)
Then
Disp "GAME OVER"
Return
End
End
It's a fairly short loop, but it takes forever even on a list of 10 items. I tried the sequence way:
If sum(seq(X=LX(I) and Y=LY(I),I,1,S))
...
The only other optimization I can think of is to not check values for N to N+2 (because the first part of your tail that's possible to hit is at N+3), but that just puts off the problem after 4 points, and having the game unplayable with 14 points is no better than being unplayable after 10 points.
Using assembly isn't an option because I don't have a link cable (or the desire to write assembly).
Never used TI-Basic...
but how about also storing a 2D array of the game board. Each element in that array indicates if the snake is present. When you move forward, set the value of the array at the head point, and clear the value at the old tail end point. Then to test for collision, you can just do one lookup into the 2D array.
The entire block:
For(I,1,S)
If X=LX(I) and Y=LY(I)
Then
Disp "GAME OVER"
Return
End
End
can be replaced with:
If sum(X=LX and Y=LY)
Then
Disp "Game Over"
Return
End
X=LX applies the test piecewise to every element of LX, and the same goes for Y=LY. The sum() checks if there is a 1 in the intersection of the two lists.
What I did, when I was programming Snake, was to check if the pixel in front of the snake was on. If it was, I would check if this pixel is the "food" pixel, otherwise, the game would stop.
Example, with I and J being head and tail positions, (F, G) being the direction of the snake, and (M, N) being the food.
if Pxl-Test(I+F, J+G) #pixel in front of snake
then
if I+F=M and J+G=N
stop
end
Much more memory-conservant than a 2D array.