Is chaining rows in the same table a bad pattern? - sql

I want to create a tree structure of categories and need to find a proper way to store it in the database. Think of the following animal tree, which pretty accurately describes what it should look like:
My question now is whether chaining those entries within the same table is a good idea or not. SQLite doesn't allow me to add a FOREIGN KEY constraint to a value in the same table, so I have to make sure manually that I don't create inconsistencies. This is what I currently plan to have:
id | parent | name
---+--------+--------
1 | null | Animal
2 | 1 | Reptile
3 | 2 | Lizard
4 | 1 | Mammal
5 | 4 | Equine
6 | 4 | Bovine
parent references an id in the same table, going all the way up until null is found, which marks the root. Is this a bad pattern? And if so, what are common alternatives for putting a tree structure into a relational database?

If your version of SQLite supports recursive CTE, then this is one option:
WITH RECURSIVE cte (n) AS (
SELECT id FROM yourTable WHERE parent IS NULL
UNION ALL
SELECT t1.id
FROM yourTable t1
INNER JOIN cte t2
ON t1.parent = t2.n AND t1.name NOT LIKE '%Lizard%'
)
SELECT *
FROM yourTable
WHERE id IN cte;
This is untested, but the check on t1.name in the recursive portion of the above CTE (hopefully) should stop the recursion as soon as we reach a record which matches the name in the LIKE expression. In the case of searching for Lizard, the walk down from the root should stop one level above Lizard, so Lizard and anything beneath it are left out. Note that sibling branches such as Mammal are still traversed and returned, not only the records directly above Lizard in the hierarchy.
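For what it's worth, SQLite does accept a foreign key that points back at the same table, although enforcement has to be switched on per connection with PRAGMA foreign_keys. Below is a minimal sketch using Python's sqlite3 module, assuming a table called category (the question never names it) and a helper called ancestors, both made up for illustration. It walks from a node up to the root, which is the opposite direction from the CTE above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.execute("""
    CREATE TABLE category (          -- hypothetical table name
        id     INTEGER PRIMARY KEY,
        parent INTEGER REFERENCES category(id),  -- self-reference; NULL marks the root
        name   TEXT NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO category (id, parent, name) VALUES (?, ?, ?)",
    [(1, None, "Animal"), (2, 1, "Reptile"), (3, 2, "Lizard"),
     (4, 1, "Mammal"), (5, 4, "Equine"), (6, 4, "Bovine")],
)

def ancestors(conn, name):
    """Return the names from the given node up to the root, nearest first."""
    rows = conn.execute("""
        WITH RECURSIVE chain (id, parent, name, depth) AS (
            SELECT id, parent, name, 0 FROM category WHERE name = ?
            UNION ALL
            SELECT c.id, c.parent, c.name, chain.depth + 1
            FROM category c
            INNER JOIN chain ON c.id = chain.parent
        )
        SELECT name FROM chain ORDER BY depth
    """, (name,)).fetchall()
    return [r[0] for r in rows]

def try_insert_orphan(conn):
    """Try to insert a row whose parent id does not exist; report whether it succeeded."""
    try:
        conn.execute("INSERT INTO category (id, parent, name) VALUES (7, 99, 'Orphan')")
        return True
    except sqlite3.IntegrityError:
        return False

print(ancestors(conn, "Lizard"))   # ['Lizard', 'Reptile', 'Animal']
print(try_insert_orphan(conn))     # False
```

With the constraint in place, the database itself rejects a parent value that matches no existing id, so that consistency check does not have to be done by hand.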

Related

SQL query to select nodes with no parent

Suppose I've got a table node with the fields id, name, and child_id:
id | name | child_id |
----------------------------
1 | node1 | NULL |
2 | node2 | 3 |
3 | node3 | NULL |
It means that "node1" has no parent and no children, "node2" has no parent and child "node3", and "node3" has parent "node2" and no children.
Now I want to select all "node" rows that have no parent.
In the example above I should get these rows:
id | name | child_id |
----------------------------
1 | node1 | NULL |
2 | node2 | 3 |
How would you implement this query?
A traditional anti-join will produce the rows you want. For example:
select c.*
from node c
left join node p on c.id = p.child_id
where p.id is null
I had to think about this, because it has the parent-child-relationships in opposite direction compared to traditional tree structures (where a node has a reference to its parent, not its child).
If you want all nodes without parents in your data structure, you want all nodes whose id is not used as a child id in other nodes. Right? (Because such other nodes would be the parents of that node.)
That could result in a query like this:
select *
from node
where id not in (select child_id
from node
where child_id is not null)
(Note that you have to be careful with three-valued logic in SQL, which could mess up a NOT IN clause. That's why I included an explicit WHERE-clause in the subquery so that that subquery only includes "valid" rows.)
By the way, the alternative query in the answer of The Impaler (which uses a join instead of a subquery) is quite fine as well. I'm not sure, but that query might even be (marginally) more efficient. However, I guess that a subquery expresses the intention of the main query more clearly, making it somewhat easier to read and to maintain. So I would personally stick with the subquery-query at first and move to the join-query only if that turns out to have a substantial and relevant performance benefit. But that's just my own humble opinion, of course. (And in such a case, I would keep the subquery-query as a comment in the code/script for documentation purposes.)
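To make the three-valued-logic caveat concrete, here is a small sketch that runs both queries against the sample data using Python's sqlite3 module, plus a NOT IN variant without the NULL filter. The helper name ids is just for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (id INTEGER PRIMARY KEY, name TEXT, child_id INTEGER)")
conn.executemany(
    "INSERT INTO node VALUES (?, ?, ?)",
    [(1, "node1", None), (2, "node2", 3), (3, "node3", None)],
)

def ids(sql):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql)]

# Anti-join: keep rows for which no other row names them as a child.
anti_join = ids("""
    SELECT c.id FROM node c
    LEFT JOIN node p ON c.id = p.child_id
    WHERE p.id IS NULL
    ORDER BY c.id
""")

# NOT IN with the NULLs filtered out of the subquery.
not_in_filtered = ids("""
    SELECT id FROM node
    WHERE id NOT IN (SELECT child_id FROM node WHERE child_id IS NOT NULL)
    ORDER BY id
""")

# NOT IN without the filter: the NULLs make the predicate unknown for every
# id that is not in the list, so no row qualifies.
not_in_unfiltered = ids("""
    SELECT id FROM node
    WHERE id NOT IN (SELECT child_id FROM node)
    ORDER BY id
""")

print(anti_join, not_in_filtered, not_in_unfiltered)
```

On this data the first two queries agree on nodes 1 and 2, while the unfiltered NOT IN returns nothing at all.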

Do a specific query for each row of a table, one by one

Let's say I have a table:
| key1 | key2 | value |
+------+------+-------+
| 1 | 1 | 1337 |
| 1 | 2 | 6545 |
| 2 | 1 | 213 |
| 3 | 1 | 131 |
What I would like to do is traverse this table row by row, then use the two key values in further queries (all other tables contain the unique combination of these two keys plus other data).
How do I do this kind of thing in SQL?
EDIT: I would want to extract key1, key2 from row 1 (1,1) then do a query on it, which would result in a number.
Then I would move to the second row, an identical query which would again result in a number.
All of these numbers would be then inserted into a pre-prepared view.
EDIT2: I need to traverse it because of the specific use of my database.
It is a database of planets which contain sectors (the keys are the IDs of these two). All of these sectors contain resources, turrets and walls.
The table I have in my post is an example of the table of sectors, with the value being the enemy force.
The tables of resources, turrets etc. contain these two keys, so each of their rows is linked to one specific sector only.
I need to go row by row so I can use these keys to select only the specific resources/turrets/walls from my tables, aggregate them and then subtract them from the value in my sector table. The resulting number would then be inserted into a pre-prepared view (again, into the row which matches the combination of my two keys).
This sounds like a correlated subquery or lateral join. You don't have that much explanation, but something like this:
select t1.*, t2.*
from table1 t1 cross join lateral
(select . . .
from table2 t2 . . .
where t2.key1 = t1.key1 and t2.key2 = t1.key2
) t2
You are not clear on what the second query looks like. The where clause is called a correlation clause. It connects the subquery to the outer query. A correlation clause is not strictly needed for this to work.
The columns from the outer query can be used elsewhere in the subquery. I am just assuming that an equality condition connects the two (lacking other information).
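As a rough illustration of the same idea on the sector example, here is a sketch using Python's sqlite3 module. SQLite has no LATERAL, so it uses a correlated scalar subquery instead. The turret table and its strength column are assumptions made up for the example, since the question does not show the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sector (key1 INTEGER, key2 INTEGER, value INTEGER,
                         PRIMARY KEY (key1, key2));
    -- Hypothetical child table: each turret row belongs to exactly one sector.
    CREATE TABLE turret (key1 INTEGER, key2 INTEGER, strength INTEGER);

    INSERT INTO sector VALUES (1, 1, 1337), (1, 2, 6545), (2, 1, 213), (3, 1, 131);
    INSERT INTO turret VALUES (1, 1, 300), (1, 1, 37), (2, 1, 13);
""")

# One set-based statement replaces the row-by-row loop: the subquery is
# evaluated for each sector row, tied to it by the correlation clause.
rows = conn.execute("""
    SELECT s.key1, s.key2,
           s.value - COALESCE(
               (SELECT SUM(t.strength)
                FROM turret t
                WHERE t.key1 = s.key1 AND t.key2 = s.key2),  -- correlation clause
               0) AS remaining
    FROM sector s
    ORDER BY s.key1, s.key2
""").fetchall()

for row in rows:
    print(row)
```

The COALESCE covers sectors with no turrets at all, where SUM would otherwise yield NULL. A query like this could also serve as the definition of the view, rather than inserting the numbers into it.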

WITH RECURSIVE SELECT via secondary table

I'm having a bit of a hard time trying to piece this together. I'm not adept with databases or complex queries.
The Database
I'm using the latest MariaDB release.
I have a database table configuration like so, representing a hierarchical data structure:
|----------------------|
| fieldsets |
|----+-----------------|
| id | parent_field_id |
|----+-----------------|
| 1 | NULL |
| 2 | 1 |
|----------------------|
|-------------------------|
| fields |
|----+--------------------|
| id | parent_fieldset_id |
|----+--------------------|
| 1 | 1 |
| 2 | 1 |
|-------------------------|
The Problem
I'm trying to piece together a recursive query. I need to select every fieldset in a given hierarchy. For example, in the above, stripped-down example, I want to select fieldset of id = 1, and every descendant fieldset.
The IDs of the next rung down in any given level in the hierarchy are obtained only via columns of a secondary table.
The table fieldsets contains no column by which I can directly get all child fieldsets. I need to get all fields that are a child of a given fieldset, and then get any fieldsets that are a child of that field.
A Better Illustration of the Problem
This query does not work because of the reported error: "Restrictions imposed on recursive definitions are violated for table all_fieldsets"
However, it really illustrates what I need to do in order to get all descendant fieldsets in the hierarchy. Remember, a fieldset does not contain a column for its parent fieldset, since a fieldset cannot have a fieldset as a direct parent. Instead, a fieldset has a parent_field_id which points to a row in the fields table, and that row in the fields table correspondingly has a column named parent_fieldset_id which points back to a row in the fieldsets table. That row is considered the parent fieldset of the fieldset, just an indirect one.
WITH RECURSIVE all_fieldsets AS (
SELECT fieldsets.* FROM fieldsets WHERE id = 125
UNION ALL
SELECT fieldsets.* FROM fieldsets
WHERE fieldsets.parent_field_id IN (
SELECT id FROM fields f
INNER JOIN all_fieldsets afs
WHERE f.parent_fieldset_id = afs.id
)
)
SELECT * FROM all_fieldsets
My Attempt
The query I have thus far (which does not work):
WITH RECURSIVE all_fieldsets AS (
SELECT fieldsets.* FROM fieldsets WHERE id = 125
UNION
SELECT fieldsets.* FROM fieldsets WHERE fieldsets.id IN (SELECT fs.id FROM fieldsets fs LEFT JOIN fields f ON f.id = fs.parent_field_id WHERE f.parent_fieldset_id = fieldsets.id)
)
SELECT * FROM all_fieldsets
My Research
I'm also having a hard time finding an example which fits my use case. There are so many results for hierarchical structures in which one table relates only to itself, not via a secondary table as in my case. It's difficult when you do not know the correct terms for certain concepts, and any layman's explanation seems to yield too many tangential search results.
My Plea
I would be enormously grateful to all who can point out where I'm going wrong, and perhaps suggest the outline of a query that will work.
The main problem I see with your current code is that the recursive portion of the CTE (the query which appears after the union) is not selecting from the recursive CTE, when it should be. Consider this updated version:
WITH RECURSIVE all_fieldsets AS (
SELECT * FROM fieldsets WHERE id = 125
UNION ALL
SELECT f1.*
FROM fieldsets f1
INNER JOIN all_fieldsets f2
ON f1.parent_field_id = f2.id
)
SELECT *
FROM all_fieldsets;
Note that the join in the recursive portion of the CTE relates a given descendant record in fieldsets to its parent in the CTE.
I got home from work, and I just could not set this down!
But, out of that came a solution.
I highly recommend reading this answer about recursive queries to get a better idea of how they work, and what the syntax means. Quite brilliantly explained: How to select using WITH RECURSIVE clause
The Solution
WITH RECURSIVE all_fieldsets AS (
-- anchor: the starting fieldset
SELECT * FROM fieldsets fs
WHERE id = 59
UNION ALL
-- step: fields of the fieldsets found so far, then the fieldsets hanging off those fields
SELECT fs.* FROM all_fieldsets afs
INNER JOIN fields f
ON f.parent_fieldset_id = afs.id
INNER JOIN fieldsets fs
ON fs.parent_field_id = f.id
)
SELECT * FROM all_fieldsets
I had to use joins to get the information from the fields table, in order to get the next level in the hierarchy, and then do this recursively until there is an empty result in the recursive query.
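Here is a small runnable sketch of that two-hop recursion, using Python's sqlite3 module rather than MariaDB (an assumption on my part that the same CTE behaves the same way there). The sample rows and the descendants helper are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fieldsets (id INTEGER PRIMARY KEY, parent_field_id INTEGER);
    CREATE TABLE fields (id INTEGER PRIMARY KEY, parent_fieldset_id INTEGER);

    -- Fieldset 1 is a root; fieldset 2 hangs off field 1 and fieldset 3 off field 3.
    -- Fieldset 4 is the root of a separate hierarchy.
    INSERT INTO fieldsets VALUES (1, NULL), (2, 1), (3, 3), (4, NULL);
    -- Fields 1 and 2 live in fieldset 1, field 3 in fieldset 2, field 4 in fieldset 4.
    INSERT INTO fields VALUES (1, 1), (2, 1), (3, 2), (4, 4);
""")

def descendants(conn, root_id):
    """Return the ids of the root fieldset and every fieldset beneath it."""
    rows = conn.execute("""
        WITH RECURSIVE all_fieldsets AS (
            SELECT * FROM fieldsets WHERE id = ?
            UNION ALL
            SELECT fs.*
            FROM all_fieldsets afs
            INNER JOIN fields f ON f.parent_fieldset_id = afs.id
            INNER JOIN fieldsets fs ON fs.parent_field_id = f.id
        )
        SELECT id FROM all_fieldsets ORDER BY id
    """, (root_id,)).fetchall()
    return [r[0] for r in rows]

print(descendants(conn, 1))
print(descendants(conn, 4))
```

Each recursive step goes from a fieldset already found, through the fields it contains, to the fieldsets whose parent_field_id points at one of those fields.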

Glueing together relational data rows

I have a sparse table structured like:
id | name | phone | account
There is no primary key or index
There are also null values. What I want is to "glue" data from different rows together, e.g.:
Given
id | name | phone | account
1 null '339-33-27' 4
null 'John' '339-33-27' 4
I want to end up with
id | name | phone | account |
1 'John' '339-33-27' 4
However, I don't know which values are missing in the table.
What is the general way to approach this kind of problem? Can I do it with joins alone, or might I need recursive functions?
Update: Provided a clearer example
id to account is many-to-many
account to name is many-to-many
phone to name is one-to-one
The database is basically raw transactional data
What I want is to get all the rows for which I already have / could find an account
If I understand you correctly then this might work. What you need is a self join
select t2.id, t1.name, t1.phone, t1.account
from table1 t1
join table1 t2 on t1.account = t2.account and t1.phone = t2.phone
where t1.name is not null
and t2.id is not null
However, this particular query relies on an assumption from your example data. My assumption is that if name is not null, id will be null, and the id can be found by looking at the phone number and account. If this assumption is not true, then we may need more sample data to solve your problem.
Depending on the data, you might need left joins, or to swap the roles so that t1 supplies the id instead of the name and the where condition checks that id is not null. It's hard to tell with such a small data sample.
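A quick way to check the self join against the two sample rows, sketched with Python's sqlite3 module. The filter on t2.id is there because the row carrying the name also matches itself on phone and account, and without it that pairing would come back as a second row with a null id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, name TEXT, phone TEXT, account INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [(1, None, "339-33-27", 4), (None, "John", "339-33-27", 4)],
)

# t1 supplies the name, t2 supplies the id for the same phone and account.
glued = conn.execute("""
    SELECT t2.id, t1.name, t1.phone, t1.account
    FROM table1 t1
    JOIN table1 t2 ON t1.account = t2.account AND t1.phone = t2.phone
    WHERE t1.name IS NOT NULL
      AND t2.id IS NOT NULL
""").fetchall()

# The same join without the id filter, to show the self-match.
unfiltered_count = conn.execute("""
    SELECT COUNT(*)
    FROM table1 t1
    JOIN table1 t2 ON t1.account = t2.account AND t1.phone = t2.phone
    WHERE t1.name IS NOT NULL
""").fetchone()[0]

print(glued, unfiltered_count)
```

This only covers the shape of the sample data; with many-to-many relations between id, account and name, the join can fan out into several candidate rows per phone.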

Remove rows NOT referenced by a foreign key

This is somewhat related to this question:
I have a table with a primary key, and I have several tables that reference that primary key (using foreign keys). I need to remove rows from that table, where the primary key isn't being referenced in any of those other tables (as well as a few other constraints).
For example:
Group
groupid | groupname
1 | 'group 1'
2 | 'group 3'
3 | 'group 2'
... | '...'
Table1
tableid | groupid | data
1 | 3 | ...
... | ... | ...
Table2
tableid | groupid | data
1 | 2 | ...
... | ... | ...
and so on. Some of the rows in Group aren't referenced in any of the tables, and I need to remove those rows. In addition to this, I need to know how to find all of the tables/rows that reference a given row in Group.
I know that I can just query every table and check the groupid's, but since they are foreign keys, I imagine that there is a better way of doing it.
This is using Postgresql 8.3 by the way.
-- group is a reserved word in PostgreSQL, so a table with that name has to be quoted
DELETE
FROM group g
WHERE NOT EXISTS
(
SELECT NULL
FROM table1 t1
WHERE t1.groupid = g.groupid
UNION ALL
SELECT NULL
FROM table2 t2
WHERE t2.groupid = g.groupid
UNION ALL
…
)
At the heart of it, SQL servers don't maintain 2-way info for constraints, so your only option is to do what the server would do internally if you were to delete the row: check every other table.
If (and be damn sure first) your constraints are simple checks and don't carry any "on delete cascade" type statements, you can attempt to delete the rows of your group table one at a time (a single DELETE covering every row fails as a whole on the first violation). Any row that does delete would thus have nothing referencing it. Otherwise, you're stuck with Quassnoi's answer.
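For reference, a small sketch of the "check every other table" approach, run through Python's sqlite3 module rather than PostgreSQL 8.3. It uses one NOT EXISTS per referencing table, which should be equivalent to the UNION ALL form. The table name groups is a stand-in I picked to sidestep the reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- "groups" is a stand-in name; GROUP itself is a reserved word.
    CREATE TABLE groups (groupid INTEGER PRIMARY KEY, groupname TEXT);
    CREATE TABLE table1 (tableid INTEGER PRIMARY KEY,
                         groupid INTEGER REFERENCES groups(groupid), data TEXT);
    CREATE TABLE table2 (tableid INTEGER PRIMARY KEY,
                         groupid INTEGER REFERENCES groups(groupid), data TEXT);

    INSERT INTO groups VALUES (1, 'group 1'), (2, 'group 3'), (3, 'group 2');
    INSERT INTO table1 VALUES (1, 3, 'a');
    INSERT INTO table2 VALUES (1, 2, 'b');
""")

# Delete every group that neither referencing table points at.
deleted = conn.execute("""
    DELETE FROM groups
    WHERE NOT EXISTS (SELECT 1 FROM table1 t1 WHERE t1.groupid = groups.groupid)
      AND NOT EXISTS (SELECT 1 FROM table2 t2 WHERE t2.groupid = groups.groupid)
""").rowcount

remaining = [r[0] for r in conn.execute("SELECT groupid FROM groups ORDER BY groupid")]
print(deleted, remaining)
```

Only group 1 is unreferenced in this sample, so it is the only row removed. Every additional referencing table needs its own NOT EXISTS clause.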