Is it possible to select a node's ancestors with a JCR-SQL2 query?
For example:
I have: /content/categories/sport/football node
Want to select: /content, /content/categories, /content/categories/sport nodes
You can, but assuming you have other siblings at those levels it's not very easy or dynamic. Honestly, it'll probably be far easier and far more performant to just use the Node methods to walk up the ancestors. Remember that you can get the Node object(s) for each row in a JCR-SQL2 query result.
Alternatively, if you just want the paths of the ancestors, you can derive them from the path of a result node (e.g., /content/categories/sport/football yields /content, /content/categories, and /content/categories/sport).
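Deriving the ancestor paths from a result path is plain string handling. Here is a minimal Python sketch of that idea; the function name `ancestor_paths` is made up for illustration, and it assumes absolute, slash-separated paths like the one in the question (the root "/" itself is left out, matching the nodes the question asks for).

```python
def ancestor_paths(path: str) -> list[str]:
    """Return every proper ancestor path of an absolute, slash-separated path."""
    parts = [p for p in path.split("/") if p]
    # Build "/a", "/a/b", ... and stop before the node itself.
    return ["/" + "/".join(parts[:i]) for i in range(1, len(parts))]


print(ancestor_paths("/content/categories/sport/football"))
# ['/content', '/content/categories', '/content/categories/sport']
```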
I'm struggling to find the best way of handling tree problems where the input is given as an array/list of pairs.
For example, a tree is given as input in the format:
[(1,3),(1,2),(2,5),(2,4),(5,8)]
Where the first value in a pair is the parent, and the second value in a pair is the child.
I'm used to being given the root in tree problems. How would one go about storing this for problems such as "Lowest Common Ancestor"?
It depends on which problem you need to solve. For the problem of finding the lowest common ancestor of two nodes, you'll benefit most from a structure where you can find the parent of a given node in constant time. If it is already given that the nodes are numbered from 1 to n (without gaps), then an array is a good structure, such that arr[child] == parent. If the identifiers for the nodes are not that predictable, then use a hashmap/dictionary, such that map.get(child) == parent.
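A minimal Python sketch of the dictionary variant, using the pair list from the question. The helper names `build_parent_map` and `lowest_common_ancestor` are made up for illustration, and the sketch assumes the input really is a single tree and treats a node as an ancestor of itself.

```python
def build_parent_map(edges):
    """Map each child to its parent; the root is the only node that is not a key."""
    return {child: parent for parent, child in edges}


def lowest_common_ancestor(parent, a, b):
    # Collect a together with all of its ancestors.
    seen = set()
    node = a
    while node is not None:
        seen.add(node)
        node = parent.get(node)  # None once we step past the root
    # Walk up from b until we reach a node on a's path to the root.
    node = b
    while node is not None:
        if node in seen:
            return node
        node = parent.get(node)
    return None  # only reached if a and b are not in the same tree


edges = [(1, 3), (1, 2), (2, 5), (2, 4), (5, 8)]
parent = build_parent_map(edges)
print(lowest_common_ancestor(parent, 8, 4))  # 2
print(lowest_common_ancestor(parent, 8, 3))  # 1
```

Each lookup is constant time, so one query costs time proportional to the depth of the two nodes.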
What's an efficient way to find all nodes within N hops of a given node? My particular graph isn't highly connected, i.e. most nodes have only degree 2, so for example the following query returns only 27 nodes (as expected), but it takes about a minute of runtime and the CPU is pegged:
MATCH (a {id:"36380_A"})-[*1..20]-(b) RETURN a,b;
All the engine's time is spent in traversals, because if I just find that starting node by itself, the result returns instantly.
I really only want the set of unique nodes and relationships (for visualization), so I also tried adding DISTINCT to try to stop it from re-visiting nodes it's seen before, but I see no change in run time.
As you said, matching the start node alone is really fast, and faster still if the property is indexed.
However, what you are doing now is matching the whole pattern across the graph.
Keep your fast starting point:
MATCH (a:Label {id:"1234-a"})
Once you have it, pass it to the rest of the query with WITH:
WITH a
Then match the relationships from that starting point:
MATCH (a)-[:Rel*1..20]->(b)
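Independently of Cypher, the "don't re-visit nodes you've already seen" idea from the question can be sketched as a depth-limited breadth-first search. This is only an illustration on an in-memory adjacency dictionary, not a description of how the Neo4j engine evaluates the query; the name `within_hops` and the sample graph are assumptions. Unlike the `*1..20` pattern, the result includes the start node at distance 0.

```python
from collections import deque


def within_hops(adjacency, start, max_hops):
    """Return {node: hop distance} for every node at most max_hops from start."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in adjacency.get(node, ()):
            if neighbour not in dist:  # each node is visited only once
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


# Hypothetical low-degree graph: a simple chain a - b - c - d.
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(within_hops(chain, "a", 2))  # {'a': 0, 'b': 1, 'c': 2}
```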
I am reading a book about removing a node from a binary search tree right now and the procedure described in the book seems unnecessarily complicated to me.
My question is specifically about removing a node that has both left and right subtree. In my opinion, node-to-remove should be replaced by the rightmost node in its left subtree or by its left node if its left subtree only has one node.
In case No.1, if we remove 40, it will be replaced by 30; in case No.2, if we remove 40, it will be replaced by 35.
But in the book, it says the replacement should be found from node-to-remove's right subtree which could involve some complex manipulations.
Am I missing something here? Please point it out.
What you have pointed out is correct: the deleted node should be replaced by either its in-order successor, which is the leftmost node in the right subtree, or its in-order predecessor, which is the rightmost node in the left subtree. Either choice keeps the in-order traversal of the tree sorted. Most binary search tree implementations could perform the deletion either way, but in some special cases you might want to implement deletion so that the tree remains balanced.
More details and sample code are available on Wikipedia.
In case no.1, if you remove node 40, it will be replaced by 50.
In case no.2, if you remove node 40, it will be replaced by 50.
So basically, when we delete any node that has two children, the removal should work as follows.
We go to the right child of the node, and then to the extreme left of that child's subtree.
The figures below show some examples of how to delete a node from a binary search tree. They are also taken from a book, but the explanation is clear.
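A minimal Python sketch of the successor-based deletion described in this answer (go to the right child, then as far left as possible). The `Node` class, the helper names, and the sample keys are assumptions made for illustration; they are not the trees from the book's figures.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root


def delete(root, key):
    """Remove key from the subtree rooted at root and return the new subtree root."""
    if root is None:
        return None
    if key < root.key:
        root.left = delete(root.left, key)
    elif key > root.key:
        root.right = delete(root.right, key)
    else:
        # Zero or one child: splice the node out.
        if root.left is None:
            return root.right
        if root.right is None:
            return root.left
        # Two children: find the in-order successor (leftmost node of the right subtree).
        succ = root.right
        while succ.left is not None:
            succ = succ.left
        root.key = succ.key                        # copy the successor's key up
        root.right = delete(root.right, succ.key)  # then remove the successor
    return root


def inorder(root):
    return [] if root is None else inorder(root.left) + [root.key] + inorder(root.right)


root = None
for k in [40, 20, 60, 10, 30, 50, 70]:  # hypothetical keys
    root = insert(root, k)
root = delete(root, 40)
print(root.key, inorder(root))  # 50 [10, 20, 30, 50, 60, 70]
```

Using the predecessor instead (rightmost node of the left subtree) is the mirror image and is equally valid.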
With the hierarchyid datatype in SQL Server 2008 and onward, would there be any benefit to trying to optimize the issuing of the next child of /1/1/8/ [ /1/1/8/x/ ] such that x is the closest non-negative whole number to 1 possible?
An easy solution seems to be to find the maximum assigned child value and getting the sibling to the right but it seems like you'd eventually exhaust this (in theory if not in practice) since you're never reclaiming any of the values and to my understanding, negatives and non-wholes consume more space.
EXAMPLE: If I've got a parent /1/1/8/ who has these children (and order of the children doesn't matter and reassignment of the values is ok):
/1/1/8/-400/
/1/1/8/1/
/1/1/8/4/
/1/1/8/40/
/1/1/8/18/
/1/1/8/9999999999/
wouldn't I want the next child to have /1/1/8/2/ ?
Here's the thing.
What you are saying will be "optimal" is not necessarily optimal.
When I am inserting values into a hierarchy, I generally do not care what the order is for the child nodes of a particular node.
If I do, that is why there are two parameters in GetDescendant.
If I want to prepend the node into the order (i.e., make it first), I use a first parameter of NULL and a second parameter that is the lowest value of the other children.
If I want to append the node into the order (i.e. make it last), I use a first parameter of the maximum value of the other children and a second parameter of NULL.
If I want to insert between two other child nodes, I need both the one that will be before and the one that will be after the node I am inserting.
In any case, generally the values in the hierarchy field don't really matter, because you will order by a different field like Name or something.
Ergo, the most "efficient" method of adding things into a hierarchy is to either prepend or append, since finding the MIN or MAX hierarchy value is easy, and doing what you are describing requires several queries to find the first "hole" in the tree.
In other words, don't put a lot of meaning onto the string representation of a hierarchy unless you are using them for an application in which you are using the hierarchy value to sort by.
Even in that case, you probably don't want to fill in hierarchy values as you describe, and probably want to append to the end anyway.
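To make the trade-off concrete, here is a toy Python model that treats the last component of each child as a plain integer. It does not call SQL Server or GetDescendant; the function names are made up, and the numbers are the children from the question. Appending needs only the maximum, while filling the first hole needs the whole set of existing children.

```python
def append_ordinal(children):
    """Next value when appending: one aggregate (the maximum) is enough."""
    return max(children, default=0) + 1


def first_hole(children):
    """Smallest positive whole number not yet used: needs every existing child."""
    used = set(children)
    n = 1
    while n in used:
        n += 1
    return n


children = [-400, 1, 4, 40, 18, 9999999999]
print(append_ordinal(children))  # 10000000000
print(first_hole(children))      # 2
```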
Hope this helped.
Is there a simple way to select the root node of a subtree (PostgreSQL ltree) from a query which returns (potentially) several descendant nodes of that same subtree? I've implemented a rather verbose algorithm for achieving the task (~40 lines, indented and formatted), but it would be awesome if I could leverage the fact that ltree data are in fact trees and have an easily accessible root node. It is important to note that several, distinct subtree roots may be returned from a single query, so I cannot merely sort the data and grab the top result.
June 07, 2012: I have updated the query to my most recent version, which cuts the time complexity in half. It uses a self-anti-join (if you will) to remove all nodes from the subtree which have ancestors in the subtree.
Essentially, my algorithm works as follows:
WITH roots AS
(
    /* Place any query here, which returns a field "ancestry" of type ltree */
)
SELECT roots.*
FROM roots
WHERE NOT EXISTS
(
    /* ltree's @> operator: left side is an ancestor of (or equal to) the right side */
    SELECT 1
    FROM roots AS ancestors
    WHERE ancestors.ancestry @> roots.ancestry
    AND ancestors.id <> roots.id
);
(for more details, please see my gist, here: https://gist.github.com/1507368)
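The same anti-join idea can be sketched in Python on plain dot-separated label paths: keep a path only if no other path in the result set is its ancestor. This is an illustration of the logic, not a replacement for the SQL; the helper names and sample paths are assumptions, and the quadratic loop mirrors the NOT EXISTS subquery rather than aiming for speed.

```python
def is_ancestor(a: str, b: str) -> bool:
    """True if dot-separated path a is a proper ancestor of path b."""
    return b.startswith(a + ".")


def subtree_roots(paths):
    """Keep only the paths that have no ancestor among the given paths."""
    unique = set(paths)
    return sorted(p for p in unique if not any(is_ancestor(q, p) for q in unique))


paths = ["a.b", "a.b.c", "a.b.c.d", "x.y", "x.y.z", "a.bc"]
print(subtree_roots(paths))  # ['a.b', 'a.bc', 'x.y']
```

Several distinct roots come back from one call, which is the case the question says a simple sort cannot handle.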
Can't you just use the subpath() function?
SELECT
SUBPATH(ancestry, 0, 1)
FROM
some_table;