When working with a BST, can the left descendant of the right child of the root node be greater than the root node and vice versa - binary-search-tree

When we are doing a BST, I understand that one major key point is that the left child must be less than the right child. Is it possible when we create a BST and have a root node, that as you traverse on the left side of that root node, and reach a right child of it, that right child is also greater than the root node?
Same thing if we were traversing on the right side of that root node. If we traverse on the right side of the root node, could we have a situation where we hit a left child that is less than the value of the root node?

When we are doing a BST, I understand that one major key point is that the left child must be less than the right child.
True, this is one of the things that follow from a BST structure.
Is it possible when we create a BST and have a root node, that as you traverse on the left side of that root node, and reach a right child of it, that right child is also greater than the root node?
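No, this is not allowed by definition. Every key in the root's left subtree, not just its direct left child, must be less than the root's key, and every key in its right subtree must be greater.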
Wikipedia defines a BST as (I highlight in bold)
...a rooted binary tree data structure whose internal nodes each store a key greater than all the keys in the node’s left subtree and less than those in its right subtree.
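To make the definition concrete, here is a minimal sketch of a validity check that carries the allowed key range down the tree. The `Node` class and `is_bst` function are names made up for this illustration, and keys are assumed to be distinct integers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def is_bst(node, low=None, high=None):
    """Return True if every key in the subtree lies strictly between low and high."""
    if node is None:
        return True
    if low is not None and node.key <= low:
        return False
    if high is not None and node.key >= high:
        return False
    # Keys on the left must stay below node.key; keys on the right above it.
    return is_bst(node.left, low, node.key) and is_bst(node.right, node.key, high)


# Each child compares correctly with its own parent (5 < 10, 12 > 5),
# but 12 sits in the left subtree of 10, so the tree is not a BST.
invalid = Node(10, left=Node(5, right=Node(12)))
valid = Node(10, left=Node(5, right=Node(7)))
print(is_bst(invalid))  # False
print(is_bst(valid))    # True
```

Comparing each node only with its direct children would accept the first tree, which is exactly the situation asked about.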

Related

Can I convert this cursor and while loop to a set based solution?

I am currently writing a script where I have a tree as well as a set of known parent nodes (none of which are the root node) and a set of known child nodes. For each child node, I have to find a direct descendant of one of the parent nodes that is also an ancestor of the child node. For each child node, only one such value exists, but there could be any number of nodes between each child node and its corresponding target.
What I have now is a cursor that iterates through each child node and uses a while loop to travel up the tree until it finds a node with a parent in the set of parent nodes, and that is the match. My question is, can I solve this without a cursor or the while loop in a set-based way? I'm not a SQL expert, but could not come up with a way to do this using merges or joins.
When working with apparently difficult tree problems, it is often useful to build an "ancestors table". This isn't just a SQL thing, it's a common tool used when dealing with hierarchies.
An ancestors table contains all the connections between the various nodes. So if you have a graph with root A, B as a child of A, and C as a child of B, your ancestors table contains a row for the connection from B to A, and a row for the connection from C to B, and a row for the connection from C to A, and then optionally a "root" row (from A to A with a length of zero).
Once you have such a table most problems become a lot easier to formulate. For example, your problem would turn into a fairly straightforward set of joins to do the following:
Find the set of rows R1(parent, child, length) in Ancestors where R1.parent is a KnownParent and the path length is 1 (this gives you the direct descendants of KnownParents), and then find the set of rows R2(parent, child) in Ancestors where R2.parent = R1.child, and R2.child is a KnownChild.
Generating an ancestors table can be done with a recursive CTE, as mentioned by HABO. There's an existing stackoverflow answer about that here
An ancestors table isn't the only way to answer this question, but it's such a useful thing to learn I suggest using one. You don't have to persist the ancestors of course, just join directly to the output of the recursive cte.
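As a rough illustration, the two steps could look like the sketch below, run against an in-memory SQLite database from Python. The table names (`tree`, `known_parent`, `known_child`), the column names, and the sample data are all assumptions made for the example, not the asker's schema, and the recursive CTE syntax may need small changes on other engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tree (id TEXT PRIMARY KEY, parent_id TEXT);
    CREATE TABLE known_parent (id TEXT PRIMARY KEY);
    CREATE TABLE known_child (id TEXT PRIMARY KEY);
    -- root -> P -> T -> M -> C   and   root -> Q -> U -> D
    INSERT INTO tree VALUES
        ('root', NULL), ('P', 'root'), ('T', 'P'), ('M', 'T'), ('C', 'M'),
        ('Q', 'root'), ('U', 'Q'), ('D', 'U');
    INSERT INTO known_parent VALUES ('P'), ('Q');
    INSERT INTO known_child VALUES ('C'), ('D');
""")

QUERY = """
WITH RECURSIVE ancestors(parent, child, length) AS (
    -- Direct parent/child links have path length 1.
    SELECT parent_id, id, 1 FROM tree WHERE parent_id IS NOT NULL
    UNION ALL
    -- Extend each known path by one more level up the tree.
    SELECT t.parent_id, a.child, a.length + 1
    FROM ancestors AS a
    JOIN tree AS t ON t.id = a.parent
    WHERE t.parent_id IS NOT NULL
)
SELECT r2.child, r1.child AS target
FROM ancestors AS r1
JOIN known_parent AS kp ON kp.id = r1.parent
JOIN ancestors AS r2 ON r2.parent = r1.child
JOIN known_child AS kc ON kc.id = r2.child
WHERE r1.length = 1
ORDER BY r2.child
"""

matches = conn.execute(QUERY).fetchall()
print(matches)  # [('C', 'T'), ('D', 'U')]
```

This sketch leaves out the optional zero-length "root" rows, so a known child that is itself a direct descendant of a known parent would not be matched; adding those rows to the CTE would cover that case.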

deleting multiple node in BST changes the resulting tree?

If I need to delete multiple nodes from a BST, does the resulting tree depend on the order of the deletions? The normal left-right ordering of the keys will be preserved, but I'm not sure about the tree structure.
This is a "philosophical" question and I need a proof of it or a counter-example.
Counterexample: deleting 1 and 2 in different orders results in two different BSTs.
If you delete 2 first, it has two children, so you must find its successor in its right subtree, that is, 3, and replace 2 with it. But if you delete 1 first and then delete 2, 2 now has only one child, so its child, 4, simply replaces it.
So the two orders result in two different trees.
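The counterexample can be reproduced with a short sketch. The exact tree from the original figure is not available here, so the shape below (2 at the root, 1 on its left, 4 on its right, 3 as the left child of 4) is an assumption that matches the description; `Node`, `delete`, `build`, and `shape` are illustrative names. The deletion rule is the usual one: splice out a node with at most one child, otherwise replace it with its in-order successor.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def delete(node, key):
    """Delete key from the subtree and return the new subtree root."""
    if node is None:
        return None
    if key < node.key:
        node.left = delete(node.left, key)
    elif key > node.key:
        node.right = delete(node.right, key)
    elif node.left is None:
        return node.right          # zero or one child: splice the node out
    elif node.right is None:
        return node.left
    else:
        succ = node.right          # two children: find the in-order successor
        while succ.left is not None:
            succ = succ.left
        node.key = succ.key
        node.right = delete(node.right, succ.key)
    return node


def build():
    # Assumed shape: 2 at the root, 1 on the left, 4 on the right, 3 under 4.
    return Node(2, Node(1), Node(4, Node(3)))


def shape(node):
    """Nested-tuple view (key, left, right) for comparing tree structure."""
    return None if node is None else (node.key, shape(node.left), shape(node.right))


two_then_one = shape(delete(delete(build(), 2), 1))
one_then_two = shape(delete(delete(build(), 1), 2))
print(two_then_one)  # (3, None, (4, None, None))
print(one_then_two)  # (4, (3, None, None), None)
```

Both results hold the same keys in the same sorted order, but one tree has 3 at the root and the other has 4.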
To prove that order of deletion has no effect on the final tree, it is necessary and sufficient to prove that any two deletion operations commute (that is, that they have the same effect if their order is reversed).
The effect of the deletion of a node is confined to that node and the subtree of which it is the root. So if two nodes are separate (i.e., neither is under the other) then their deletions commute. So the only cases of interest have one node in the other's subtree.
Without loss of generality, suppose we use the rule that when we delete a node that has two children, we replace it with its successor. We'll call the higher one A and the lower one B. If B is in A's left subtree, then the deletions commute because the deletion of A has no effect on A's left subtree, and the deletion of B has no effect outside A's left subtree. So the only case of interest is when B is in A's right subtree.
When A is deleted, the effect on A's right subtree is the same as if A's successor had been deleted. Suppose B is not A's successor; we'll call A's successor C. The deletion of A consists of the deletion of C from the right subtree and the replacement of A with C (which commute), so if the deletions of B and C commute, then the deletion of A and B commute. By induction, if any pair of deletions do not commute, then the deletion of A and B where B is A's successor does not commute.
But the deletion of A and its successor do commute, by inspection. Q.E.D.

How can I store a binary space partitioning tree in a relational database?

I'm trying to store the data in a binary space partitioning tree in a relational database. The tricky part about this data structure is it has two different types of nodes. The first type, which we call a data node, simply holds a certain number of items. We define the maximum number of items able to be held as t. The second type, which we refer to as a container node, holds two other child nodes. When an item is added to the tree, the nodes are recursed until a data node is found. If the number of items in the data node are less than t, then the item is inserted into the data node. Otherwise the data node is split into two other data nodes, and is replaced by one of the container nodes. When an element is deleted, a reverse process must happen.
I'm a little bit lost. How am I supposed to make this work using a relational model?
Why not have two tables, one for nodes and one for items? (Note that I used the term "leaf" instead of "data" nodes below when I wrote my answer; a "leaf" node has data items, a non-"leaf" node contains other nodes.)
The node table would have columns like this: id primary key, parentid references node, leaf boolean and in addition some columns to describe the spatial boundaries of the node and how it will be or has been split. (I don't know if you're working in 2D or 3D so I haven't given details on the geometry.)
The data table would have id primary key, leafid references node and whatever data.
You can traverse the tree downward by issuing SELECT * FROM node WHERE parentid = ? queries at each level and checking which child to descend into. Adding a data item to a leaf is a simple INSERT. Splitting a node requires unsetting the leaf flag, inserting two new leaf nodes, and updating all the data items in the node to point to the appropriate child node by changing their leafid values.
Note that SQL round trips can be expensive, so if you're looking to use this for a real application, consider using a relatively large t in the DB and constructing a finer-grained tree in memory of the leaves you are interested in after you have the data items.
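Here is a rough sketch of that two-table layout using SQLite from Python. To keep it short it assumes a one-dimensional space where each node covers an interval [lo, hi) and a split cuts the interval in half; the `lo`, `hi`, and `x` columns, the helper names, and the value of `T` are all assumptions for the illustration. It also assumes coordinates are distinct and fall inside the root's bounds.

```python
import sqlite3

T = 2  # assumed maximum number of items per leaf (the question's t)

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (
        id INTEGER PRIMARY KEY,
        parentid INTEGER REFERENCES node(id),
        leaf INTEGER NOT NULL,   -- 1 = holds items, 0 = holds two child nodes
        lo REAL NOT NULL,        -- assumed 1D bounds: the node covers [lo, hi)
        hi REAL NOT NULL
    );
    CREATE TABLE item (
        id INTEGER PRIMARY KEY,
        leafid INTEGER NOT NULL REFERENCES node(id),
        x REAL NOT NULL
    );
    INSERT INTO node (id, parentid, leaf, lo, hi) VALUES (1, NULL, 1, 0.0, 100.0);
""")


def find_leaf(x):
    """Walk down from the root, one query per level, to the leaf covering x."""
    node_id, leaf = conn.execute(
        "SELECT id, leaf FROM node WHERE parentid IS NULL").fetchone()
    while not leaf:
        node_id, leaf = conn.execute(
            "SELECT id, leaf FROM node WHERE parentid = ? AND lo <= ? AND ? < hi",
            (node_id, x, x)).fetchone()
    return node_id


def count_items(leaf_id):
    return conn.execute(
        "SELECT COUNT(*) FROM item WHERE leafid = ?", (leaf_id,)).fetchone()[0]


def split(leaf_id):
    """Turn a full leaf into a container with two new leaf children."""
    lo, hi = conn.execute(
        "SELECT lo, hi FROM node WHERE id = ?", (leaf_id,)).fetchone()
    mid = (lo + hi) / 2
    conn.execute("UPDATE node SET leaf = 0 WHERE id = ?", (leaf_id,))
    for child_lo, child_hi in ((lo, mid), (mid, hi)):
        child_id = conn.execute(
            "INSERT INTO node (parentid, leaf, lo, hi) VALUES (?, 1, ?, ?)",
            (leaf_id, child_lo, child_hi)).lastrowid
        # Re-point the items that fall inside this child's bounds.
        conn.execute(
            "UPDATE item SET leafid = ? WHERE leafid = ? AND ? <= x AND x < ?",
            (child_id, leaf_id, child_lo, child_hi))


def add_item(x):
    leaf_id = find_leaf(x)
    # Split until the target leaf has room (assumes coordinates are distinct).
    while count_items(leaf_id) >= T:
        split(leaf_id)
        leaf_id = find_leaf(x)
    conn.execute("INSERT INTO item (leafid, x) VALUES (?, ?)", (leaf_id, x))


for x in (10.0, 20.0, 80.0):
    add_item(x)

print(conn.execute("SELECT leafid, x FROM item ORDER BY x").fetchall())
```

Deleting an item would run the reverse process (merge two sibling leaves back into their parent when their combined item count drops low enough), which this sketch leaves out.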

Adjacency list tree - how to prevent circular references?

I have an adjacency list in a database with ID and ParentID to represent a tree structure:
-a
--b
---c
-d
--e
Of course in a record the ParentID should never be the same as ID, but I also have to prevent circular references to avoid an endless loop. These circular references could in theory involve more than 2 records (a->b, b->c, c->a, etc.).
For each record I store the paths in a string column like this:
a a
b a/b
c a/b/c
d d
e d/e
My question is now:
when inserting/updating, is there a way to check if a circular reference would occur?
I should add that I know all about the nested set model, etc. I chose the adjacency method with stored paths because I find it much more intuitive. I got it working with triggers and a separate paths table, and it works like a charm, except for the possible circular references.
If you're storing the path like that, you could put in a check that the path does not contain the id.
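A minimal sketch of that check, assuming the path format shown in the question (ids joined with "/") and a hypothetical helper name `would_create_cycle`. Before setting a node's parent, look at the stored path of the proposed parent: if the node's own id already appears in it, the node would become its own ancestor.

```python
def would_create_cycle(node_id, new_parent_path, sep="/"):
    """Return True if moving node_id under the node with new_parent_path
    would make node_id its own ancestor."""
    # Compare whole path segments so that id "a" does not match "ab".
    return node_id in new_parent_path.split(sep)


# Paths from the question: c is stored as a/b/c, e as d/e.
print(would_create_cycle("a", "a/b/c"))  # True: a would become a child of its own descendant c
print(would_create_cycle("a", "d/e"))    # False: e is not below a
print(would_create_cycle("b", "b"))      # True: a node cannot be its own parent
```

The same test could live in the existing trigger, as long as it compares whole path segments rather than raw substrings.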
If you are using Oracle you can implement a check for cycles using the CONNECT BY syntax. The count of nodes should be equal to the count of descendants from the root node.
    CHECK (
        (SELECT COUNT(*) Nodes
         FROM Tree) =
        (SELECT COUNT(*) Descendants
         FROM Tree
         START WITH parent_node IS NULL -- Root Node
         CONNECT BY parent_node = PRIOR child_node))
Note, you will still need other checks to enforce the tree, i.e.:
Single root node with a null parent.
Each node can have exactly one parent.
You cannot create a check constraint with a subquery, so this will need to go into a view or trigger.

How to find all nodes in a subtree in a recursive SQL query?

I have a table which defines a child-parent relationship between nodes:
CREATE TABLE node (
    id INTEGER PRIMARY KEY,
    parentID INTEGER REFERENCES node(id) -- should be a valid id, or NULL
);
If parentID always points to a valid existing node, then this will naturally define a tree structure.
If the parentID is NULL then we may assume that the node is a root node.
How would I:
Find all the nodes which are descendants of a given node?
Find all the nodes under a given node to a specific depth?
I would like to do each of these as a single SQL (I expect it would necessarily be recursive) or two mutually recursive queries.
I'm doing this in an ODBC context, so I can't rely on any vendor specific features.
Edit
No tables are written yet, so adding extra columns/tables is perfectly acceptable.
The tree will potentially be updated and added to quite often; auxiliary data structures/tables/columns would be possible, though they would need to be kept up to date.
If you have any magic books you reach for for this kind of query, I'd like to know.
Many thanks.
This link provides a tutorial on both the Adjacency List Model (as described in the question), and the Nested Set Model. It is written as part of the documentation for MySQL.
What is not discussed in that article is insertion/deletion time, and the maintenance cost of the two approaches. For example:
a dynamically grown tree using the Nested Set Model would seem to need some maintenance to maintain the nesting (e.g. renumbering all left and right set numbers)
removal of a node in the adjacency list model would require updates in at least one other row.
If you have any magic books you reach for for this kind of query, I'd like to know.
Celko's Trees and Hierarchies in SQL For Smarties
Store the entire "path" from the root node's ID in a separate column, being sure to use a separator at the beginning and end as well. E.g. let's say 1 is the parent of 5, which is the parent of 17, and your separator character is dash, you would store the value -1-5-17- in your path column.
Now to find all descendants of 5 you can simply select records where the path includes -5- (for example with LIKE '%-5-%'). This also matches node 5 itself, so exclude it if you only want the nodes below it.
The separators at the ends are necessary so you don't need to worry about IDs that are at the leftmost or rightmost end of the field when you use LIKE.
As for your depth issue, if you add a depth column to your table indicating the current nesting depth, this becomes easy as well. You look up your starting node's depth and then you add x to it where x is the number of levels deep you want to search, and you filter out records with greater depth than that.
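Putting the path and depth columns together, a sketch of both queries might look like this, run against SQLite from Python. The `path` and `depth` column names, the sample rows, and the `descendants` helper are assumptions for the illustration; the SQL sticks to LIKE and the standard || concatenation operator, though some engines spell concatenation differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (
        id INTEGER PRIMARY KEY,
        parentID INTEGER REFERENCES node(id),
        path TEXT NOT NULL,     -- ids from the root down, e.g. -1-5-17-
        depth INTEGER NOT NULL  -- 0 for the root
    );
    INSERT INTO node VALUES
        (1, NULL, '-1-', 0),
        (5, 1, '-1-5-', 1),
        (15, 1, '-1-15-', 1),
        (17, 5, '-1-5-17-', 2),
        (23, 17, '-1-5-17-23-', 3);
""")


def descendants(node_id, max_levels=None):
    """Ids below node_id, optionally limited to max_levels levels down."""
    sql = """
        SELECT d.id
        FROM node AS d
        JOIN node AS s ON s.id = ?
        WHERE d.path LIKE '%-' || s.id || '-%'
          AND d.id <> s.id
    """
    params = [node_id]
    if max_levels is not None:
        # Keep only rows at most max_levels deeper than the starting node.
        sql += " AND d.depth <= s.depth + ?"
        params.append(max_levels)
    sql += " ORDER BY d.id"
    return [row[0] for row in conn.execute(sql, params)]


print(descendants(5))     # [17, 23]
print(descendants(5, 1))  # [17]
```

Note that node 15 is not returned for node 5: its path contains -15-, not -5-, which is what the separators on both sides of each id are for.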