SQL loop. I want to iterate through a loop containing SELECT results - sql

From a table with column structure (parent, child) I need:
For a particular parent I need all children.
From the result of (1) I need the children's children too.
For example for parent=1:
parent|child parent|child parent|child
1 a a d b f
b e g

This gets you the information you say you want, I think. Two columns: child and grandchild (if any, or else NULL). Not sure if it's the schema you'd like, since you don't specify. You may add JOINs to increase the recursion depth.
select t1.child, t2.child
from T as t1 left join T as t2
on t1.child = t2.parent
where t1.parent = 1
This works on SQLite; I think it's quite standard. Regarding schema if this one doesn't serve you, hopefully it may give you ideas; or else please specify more.

Related

SQL get leaves of directed acyclic graph

I am quite new to SQL and have a rather basic question. Suppose I'm dealing with the following table structure:
CREATE TABLE nodes (
id INTEGER NOT NULL PRIMARY KEY,
parent INTEGER REFERENCES nodes(id)
);
If we hold an invariant that says, the parent of a node cannot be equivalent to any of its children, then by definition we will not have any loops in our graph. Now we are left with a disjoint directed acyclic graph.
The two questions I have then are:
If we cannot change the structure of the database: What select statement would I have to write to efficiently get all of the leaves in my database? I.e. the ids that don't have any children.
If we can change the structure of the tables: What could we change or add to make this select statement more efficient?
An example of output for the graph with five nodes whose parents where 3->2, 2->1, and 5->4 would output 3 and 5 because they are the only nodes that don't have children.
You can use NOT EXISTS and a correlated subquery that checks for node where the current not is the parent. For leafs no such record can exist.
SELECT *
FROM nodes n1
WHERE NOT EXISTS (SELECT *
FROM nodes n2
WHERE n2.parent = n1.id);
Another option is a left join joining possible children of a node. If there's a null for an id of the "children's side" of the join no child exists for the current node, it's a leaf.
SELECT *
FROM nodes n1
LEFT JOIN nodes n2
ON n2.parent = n1.id
WHERE n2.id IS NULL;
And, leaving denormalization away, I don't think there's much to change in the table's structure. Indexes could help though. One should be on id (but that's already the case because of the primary key constraint) and one on parent (but again such an index already exists because MySQL creates indexes for foreign key tuples).
For more complex graph queries, you may use Common Table Expressions (CTEs), standardized in SQL:99 and supported in MySQL since 8.0.1 (reference)
But as others pointed out, for the query you're interested in, a simple NOT EXISTS subquery or equivalent is enough. Yet another equivalent to those already posted would be using the EXCEPT set operation:
SELECT id FROM nodes
EXCEPT SELECT parent FROM nodes
I would do:
select *
from nodes
where id not in (select parent from nodes where parent is not null)

In SQL, how can I select all parents that have children?

Let's say I have two related tables parents and children with a one-to-many relationship (one parent to many children). Normally when I need to process the information on these tables together, I do a query such as the following (usually with a WHERE clause added in):
SELECT * FROM parents INNER JOIN children ON (parents.id = children.parent_id);
How can I select all parents that have at least one child without wasting time joining all of the children to their parents?
I was thinking of using some sort of OUTER JOIN but I am not sure exactly what to do with it.
(Note that I am asking this question generally, so don't give me an answer that is tied to a specific RDBMS implementation unless there is no general solution.)
As I put earlier in comments:
Solution with LEFT JOIN and GROUP BY:
SELECT p.parents.id FROM parents p
LEFT JOIN children c ON (p.parents.id = c.children.parent_id)
WHERE children.parent_id IS NOT NULL
GROUP BY p.parents_id
The same with DISTINCT:
SELECT DISTINCT p.parents.id FROM parents p
LEFT JOIN children c ON (p.parents.id = c.children.parent_id)
WHERE children.parent_id IS NOT NULL
It should work in most SQL dialects, though some require as when assigning table aliases.
The above is not tested. Hopefully I made no typos.
I think that the simplest solution that avoids a JOIN would be:
SELECT * FROM parents WHERE id IN (SELECT parent_id FROM children);
try this
select parent_id,(select count(1) from children where parent_id = x.parent_id)
from parent x where
(select count(1) from children where parent_id = x.parent_id) > 0

SELECT When child record doesn't exist

I have two tables parent and children. The parent.mopid and children.mopid is the connection between the two tables. How would I write a SELECT that would end result show me just the parent records where there are no children records?
Use the NOT IN function:
SELECT * from parent
where parent.mopid NOT IN (SELECT mopid from children)
This will return all rows from the parent table that do not have a corresponding mopid in the childrens table.
If you have a lot of rows, a LEFT JOIN is often quicker than a NOT IN. But not always - it depends on the data so please try this answer and the one from #aktrazer and see which works best for you.
SELECT parent.*
FROM parent
LEFT JOIN children ON parent.mopid = children.mopid
WHERE children.mopid IS NULL
If there isn't a children row for the mopid, parent.mopid will have a value but child.mopid will be null.
SELECT * from parent p where NOT EXISTS
( select mopid from children c where p.mopid = c.mopid)
This should take care of the nulls as well
This link will explain you the difference between NOT IN and NOT EXISTS
NOT IN vs NOT EXISTS

How to get a related row if one (another) row exists?

I'm aware that this question's title might be a little bit inaccurate but I couldn't come up with anything better. Sorry.
I have to fetch 2 different fields, one is always there, the other isn't. That means I'm looking at a LEFT JOIN. Good so far.
But the row I want shown is not the row whose existence is uncertain.
I would like to do something like:
Show name and picture, but only show the picture if that name has a picture_id. Otherwise show nothing for the picture, but I still want the names regardless(left join).
I know this might be a little confusing but there's some clever guys out here so I guess somebody will understand it.
I tried some approaches but I couldn't quite say what I want in SQL.
P.S.: solutions specific to Oracle are good too.
------------------------------------------------------------------------------------------------------------------------------------
EDIT I've tried some queries but the main problem I found is that, inside the ON clause, I am only able to reference the last table mentioned, in other words:
There are four tables from which I'm retrieving data, but I can only mention the last (third table) inside the on clause of the LEFT JOIN(which is the 4th table). I'll describe the tables hopefully that'll help. Try not to delve too much on the names, because they are in Portuguese:
There are 4 tables. The fields I want to retrieve are :TB395.dsclaudo and TB397.dscrecomendacao, for a given TB392.nronip. The tables are as follows:
TB392(laudoid,nronip,codlaudo) // laudoid is PK, references TB395
TB395(codlaudo,dsclaudo) //codlaudo is PK
TB398(laudoid,codrecomendacao) //the pair laudoid,codrecomendacao is PK , references TB397
TB397(codrecomendacao,dscrecomendacao) // codrecomendacao is PK
Fields with the same name are foreign keys.
The problem is that there's no guarantee that, for a given laudoid,there will be one codrecomendacao. But, if there is, I want the dscrecomendacao field returned, that's what I don't know how to do. But even if there isn't a corresponding codrecomendacao for the laudoid, I still want the dsclaudo field, that's why I think a LEFT JOIN applies.
Sounds like you want your primary row source to be the join of TB392 and TB395; then you want an outer join to TB398, and when that gets a match, you want to lookup the corresponding value in TB397.
I would suggest coding the primary join as one inline view; the join between the two extra tables as a second inline view; and then doing an outer join between them. Something like:
SELECT ... FROM
(SELECT ... FROM TB392 JOIN TB395 ON ...) join1
LEFT JOIN
(SELECT ... FROM TB398 JOIN TB397 ON ...) join2
ON ...
It would be nice if you could specify what your tables are, which columns are on which tables, and what columns they join on. Its not clear if you have two tables or only one. I guess you have two tables because you are talking about a LEFT JOIN, and seem to imply that the join is on the name column. So you can use the NVL2 function to accomplish waht you want. So guessing what I can from your question, maybe something like:
SELECT T1.name
, NVL2( T2.picture_id, T1.picture, NULL )
FROM table1 T1
LEFT JOIN
table2 T2
ON T1.name = T2.name
If you only have one table, then its even simpler
SELECT T1.name
, NVL2( T1.picture_id, T1.picture, NULL )
FROM table1 T1
I think you need:
SELECT ...
FROM
TB395
JOIN
TB392
ON ...
LEFT JOIN --- this should be a LEFT JOIN
TB398
ON ...
LEFT JOIN --- and this as well, so the previous is not cancelled
TB397
ON ...
The details may be not accurate:
SELECT
a.dsclaudo
, b.laudoid
, c.codrecomendacao
, d.dscrecomendacao
FROM
TB395 a
JOIN
TB392 b
ON b.codlaudo = a.codlaudo
LEFT JOIN
TB398 c
ON c.laudoid = b.laudoid
LEFT JOIN
TB397 d
ON d.codrecomendacao = c.codrecomendacao
Create two views and then do your left join on the views. For example:
Create View view392_395
as
SELECT
t1.laudoid,
t1.nronip,
t1.codlaudo,
t2.dsclaudo
FROM TB392 t1
INNER JOIN TB395 t2
ON t1.codlaudo
= t2.codlaudo
Create View view398_397
as
SELECT
t1.laudoid,
t1.codrecomendacao,
t2.dscrecomendacao
FROM TB398 t1
INNER JOIN TB397 t2
ON t1.codrecomendacao
= t2.codrecomendacao
SELECT
v1.laudoid,
v1.nronip,
v1.codlaudo,
v1.dsclaudo,
v2.codrecomendacao,
v2.dscrecomendacao
FROM view392_395 v1
LEFT OUTER JOIN view398_397 v2
ON v1.laudoid
= v2.laudoid
In my opinion, views are always under used. Views are your friend. They can simplify some of the most complicated queries.

How many SQL queries do I need?

I have a database table (sqlite) containing items that form a tree hierarchy. Each item has an id field (for itself) and a parentId for its parent. Now given an item, I must retrieve the whole chain from the root to the item.
Basically the algorithm in pseudocode looks like:
cursor is item
retrieve parentItem for cursor by parentId
if parentItem is not rootItem, then cursor = parentItem and goto 2.
So I have to perform an SQL SELECT query for each item.
Is it possible to retrieve the whole chain rootItem -> ... -> item by performing only one SQL query?
There are lots of creative ways of organizing hierarchial data in a database, but consistently I find it easiest bring back the data in non-hierarchial format, then match up parent and child records programmatically.
Total amount of effort: 1 query + 1 programmatic pass through your dataset to create the hierarchy.
Alternative approach:
I've used this method in the past with limited success. You can store the path of each item in your tree using a varchar(max) column as follows:
ID ParentID Path
-- -------- ----
1 null 1/
2 1 1/2/
3 null 3/
4 2 1/2/4/
5 4 1/2/4/5/
6 null 6/
7 5 1/2/4/5/7/
9 5 1/2/4/5/9/
From that point, getting all of the nodes under ID = 5 is a very simple:
SELECT *
FROM table
WHERE Path like (SELECT Path FROM Table WHERE ID = 5) + '%'
Not with ANSI standard SQL it isn't, no. Well, that's not strictly true. You can do left outer joins and put in enough to cover the likely maximum depth but unless you restrain the max depth and include that many joins, it won't always work.
If your set of rows is sufficiently small (say less than 1000), just retrieve them all and then figure it out. It'll be faster than single read traversals in all likelihood.
You could batch the parent traversal. Have a query like:
SELECT t1.id id1, t1.parent parent1,
t2.id id2, t2.parent parent2,
t3.id id3, t3.parent parent3,
t4.id id4, t4.parent parent4,
t5.id id5, t5.parent parent5
FROM mytable t1
LEFT OUTER JOIN mytable t2 ON t1.parent = t2.id
LEFT OUTER JOIN mytable t3 ON t2.parent = t3.id
LEFT OUTER JOIN mytable t4 ON t3.parent = t4.id
LEFT OUTER JOIN mytable t5 ON t4.parent = t5.id
WHERE t1.id = 1234
and extend it to whatever number you want. If the last retrieved parent isn't null you aren't at the top of the tree yet so run the query again. This way you should hopefully reduce it to 1-2 roundtrips.
Other than that you could look at ways of encoding that data in the ID. This isn't recommended but if you limit, say, each node to having 100 children you could say that node with an ID 10030711 has path of 10 -> 03 -> 07 -> 11. That of course has other problems (like max ID length) and of course it's hacky.
It's also worth noting that there are two basic models for hierarchical data in SQL. Adjacency lists and nested sets. Your way (which is pretty common) is an adjacency set. Nested sets wouldn't really help with this situation though and they are complicated to do inserts on.
are you able to change the table structure? Looks like storing left and right nodes would be easier to work with than storing just a parent because then a single select is possible. See the following links:
http://www.mail-archive.com/sqlite-users#sqlite.org/msg23867.html
http://weblogs.asp.net/aghausman/archive/2009/03/16/storing-retrieving-hierarchical-data-in-sql-server-database.aspx (this is SQLServer, but they have a diagram that might help.)