Retrieving parent value from mysql self join - sql

I have a table with fields:
id
name
parent_id
grandparent_id
I want to select the id, name, parent name, grandparent name.
Can I do this with a self join? I basically want to retrieve the "name" value where parent_id = id and return one row.
Example:
id name parent_id grandparent_id
-----------------------------------------------
1 Milton NULL NULL
2 Year 3 1 NULL
3 Class A 2 1
So I want to select the 3rd row (id = 3), but instead of returning just the parent_id and grandparent_id, I want the query to return the names of those records based on their ids. Can I create a composite field, say parent_id_name and grandparent_id_name?
I'm pretty sure this can be achieved with a self join or subquery, but all of the code I have tried so far has failed to work. Any help would be really appreciated.

This is the query that you asked for:
# By using LEFT JOINs you will be able to read any record,
# even one with missing parent/grand-parent...
SELECT
child.id,
child.name,
parent.id,
parent.name,
gparent.id,
gparent.name
FROM
some_table child
LEFT JOIN some_table parent ON
parent.id = child.parent_id
LEFT JOIN some_table gparent ON
gparent.id = child.grandparent_id
WHERE
child.id = 3
BUT I would also add that the redundancy of having a field grandparent_id does NOT sound right to me...
Your table should be just:
id name parent_id
1 Milton NULL
2 Year 3 1
3 Class A 2
Notice that, if I know that 1 is the parent of 2, I don't need to repeat that same information again on record 3...
In this last case, your select could be like this:
SELECT
child.id,
child.name,
parent.id,
parent.name,
gparent.id,
gparent.name
FROM
some_table child
LEFT JOIN some_table parent ON
parent.id = child.parent_id
LEFT JOIN some_table gparent ON
gparent.id = parent.parent_id -- See the difference?
WHERE
child.id = 3
The query would work the same, and you would also have a more "normalized" database.
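As a quick way to check the normalized version, here is a minimal sketch that runs the same two-level self join against an in-memory SQLite database from Python. SQLite stands in for MySQL here, and the column aliases parent_name and grandparent_name, as well as the helper lookup, are illustrative assumptions rather than anything from the original post.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE some_table (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER)"
)
conn.executemany(
    "INSERT INTO some_table (id, name, parent_id) VALUES (?, ?, ?)",
    [(1, "Milton", None), (2, "Year 3", 1), (3, "Class A", 2)],
)

# The grandparent is reached through the parent row, not through a stored column.
QUERY = """
SELECT child.id,
       child.name,
       parent.name  AS parent_name,
       gparent.name AS grandparent_name
FROM some_table child
LEFT JOIN some_table parent  ON parent.id  = child.parent_id
LEFT JOIN some_table gparent ON gparent.id = parent.parent_id
WHERE child.id = ?
"""


def lookup(record_id):
    """Return (id, name, parent_name, grandparent_name) for one record."""
    return conn.execute(QUERY, (record_id,)).fetchone()


print(lookup(3))
print(lookup(1))
```

Because both joins are LEFT JOINs, a record with no parent still comes back as one row, with NULL (None in Python) in the missing name columns.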
Edit: This is pretty basic stuff, but I guess it is relevant to this answer...
This kind of denormalization (i.e. to have both parent_id and grandparent_id on the same record) should not be used because it allows the database to be inconsistent.
For instance, let's suppose that a new record is inserted:
id name parent_id grandparent_id
1 Milton NULL NULL
2 Year 3 1 NULL
3 Class A 2 1
4 Invalid Rec 2 3
It doesn't make any sense, right? Record 4 is stating that 3 is its grandparent. So, 3 should be the parent of record 2. But that's not what is stated on record 3 itself. Which record is right?
You may think this is an odd error, and that your database will never become like this. But my experience says otherwise - if an error may happen, it will eventually. Denormalization should be avoided, not just because some database guru says so, but because it really increases inconsistencies, and makes maintenance harder.
Of course, denormalized databases may be faster. But, as a rule of thumb, you should think about performance after your system is ready for production, and after you have confirmed, by means of some automated or empirical test, that a bottleneck exists. Believe me, I have seen much worse design choices being justified by wrong performance expectations before...
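If the grandparent_id column has to stay, the kind of inconsistency described above can at least be detected. The following is a rough sketch, again using SQLite from Python as a stand-in for MySQL: it flags rows whose stored grandparent_id disagrees with the parent's own parent_id. The IS NOT operator is SQLite's NULL-safe inequality; in MySQL the comparable test would be NOT (a <=> b).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE some_table ("
    "id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER, grandparent_id INTEGER)"
)
conn.executemany(
    "INSERT INTO some_table VALUES (?, ?, ?, ?)",
    [
        (1, "Milton", None, None),
        (2, "Year 3", 1, None),
        (3, "Class A", 2, 1),
        (4, "Invalid Rec", 2, 3),  # claims 3 as grandparent, but 2's parent is 1
    ],
)

# A row is inconsistent when its stored grandparent_id differs from
# the parent_id recorded on its parent row (NULL-safe comparison).
CHECK = """
SELECT child.id, child.name, child.grandparent_id, parent.parent_id
FROM some_table child
JOIN some_table parent ON parent.id = child.parent_id
WHERE child.grandparent_id IS NOT parent.parent_id
ORDER BY child.id
"""

bad_rows = conn.execute(CHECK).fetchall()
print(bad_rows)
```

Each returned row shows the stored grandparent_id next to the value implied by the parent row, so it is easy to see which of the two needs correcting.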

SELECT t1.name,
MAX(CASE WHEN t2.id = t1.parent_id then t2.name end) as Parent,
MAX(CASE WHEN t2.id = t1.grandparent_id then t2.name end) as GrandParent
FROM your_table t1
LEFT OUTER JOIN your_table t2 ON t2.id IN (t1.parent_id, t1.grandparent_id)
WHERE t1.id = 3
group by t1.id, t1.name

Try it like this; this handles a single parent level:
SELECT e.entity_name AS 'entity',
m.entity_name AS 'parent'
FROM table_name AS e
LEFT OUTER JOIN table_name AS m ON e.entity_parent = m.entity_id
Or check the link below:
http://databases.about.com/od/sql/a/selfjoins.htm

Related

How to replace/connect two values from the same table when they report as null?

I have a table which has the following format with multiple values;
s_sym   | parent | r_id
-------------------------
aaaa.BW | aaa    | NULL
aaaa    | NULL   | 12345
I have another table which connects to this one through the same r_id.
r_id  | gross | date
-------------------------
12345 | 12586 | 1/1/01
Only the parent rows have an r_id; on the rows that carry the full s_sym, r_id is NULL, so those rows come back with NULLs when I try to join the tables. Is there a way to connect the s_sym to the r_id so I can join the two tables directly and get results like this:
r_id  | s_sym   | parent | gross
---------------------------------
12345 | aaaa.BW | aaaa   | 12586
instead of what currently appears
r_id | s_sym   | parent | gross
--------------------------------
NULL | aaaa.BW | aaaa   | NULL
Thanks in advance.
It seems pretty simple to me -
SELECT TEMP.r_id, TEMP.s_sym, TEMP.parent, T2.gross
FROM (SELECT T2.s_sym,
COALESCE(T1.parent, T2.parent) parent,
COALESCE(T1.r_id, T2.r_id) r_id
FROM TABLE1 T1
JOIN TABLE1 T2 ON T1.s_sym = SUBSTR(T2.s_sym, 1, INSTR(T2.s_sym, '.') - 1)) TEMP
JOIN TABLE2 T2 ON TEMP.r_id = T2.r_id
Demo.
It's far from clear from your question, but presumably you need to join your rows on the first part (i.e., before the ".") of the s_sym column.
We have no idea of the data types, the rules, or even the database you are using (you should tag the database in your question). However, the following produces your desired output. Is this what you are after?
select t1.r_id, t1s.s_sym, t1.s_sym Parent, t2.gross
from t1 join t2 on t2.r_id=t1.r_id
join t1 t1s on t1s.s_sym like Concat(t1.s_sym,'%')
where t1s.parent is not null
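To make the prefix-matching idea concrete, here is a small sketch using SQLite from Python. The table names t1 and t2 follow the query above; the use of SQLite, the column types, and the aliases c (child row) and p (parent row) are assumptions, since the question does not say which database is in use. The parent value is taken from the matching row's s_sym rather than from the parent column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (s_sym TEXT, parent TEXT, r_id INTEGER);
CREATE TABLE t2 (r_id INTEGER, gross INTEGER, date TEXT);
INSERT INTO t1 VALUES ('aaaa.BW', 'aaa', NULL), ('aaaa', NULL, 12345);
INSERT INTO t2 VALUES (12345, 12586, '1/1/01');
""")

# c is the row with the dotted symbol, p is the row whose s_sym equals
# the part of c.s_sym before the '.'; only p carries an r_id.
QUERY = """
SELECT p.r_id, c.s_sym, p.s_sym AS parent, t2.gross
FROM t1 AS c
JOIN t1 AS p ON p.s_sym = substr(c.s_sym, 1, instr(c.s_sym, '.') - 1)
JOIN t2 ON t2.r_id = p.r_id
WHERE instr(c.s_sym, '.') > 0
"""

rows = conn.execute(QUERY).fetchall()
print(rows)
```

Matching on the exact prefix before the dot is a little stricter than LIKE with a trailing wildcard, which would also pair 'aaaa' with a symbol such as 'aaaab.BW'.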

Will this left join on same table ever return data?

In SQL Server, on a re-engineering project, I'm walking through some old sprocs, and I've come across this bit. I've hopefully captured the essence in this example:
Example Table
SELECT * FROM People
Id | Name
-------------------------
1 | Bob Slydell
2 | Jim Halpert
3 | Pamela Landy
4 | Bob Wiley
5 | Jim Hawkins
Example Query
SELECT a.*
FROM (
SELECT DISTINCT Id, Name
FROM People
WHERE Id > 3
) a
LEFT JOIN People b
ON a.Name = b.Name
WHERE b.Name IS NULL
Please disregard formatting, style, and query efficiency issues here. This example is merely an attempt to capture the exact essence of the real query I'm working with.
After looking over the real, more complex version of the query, I burned it down to this above, and I cannot for the life of me see how it would ever return any data. The LEFT JOIN should always exclude everything that was just selected because of the b.Name IS NULL check, right? (and it being the same table). If a row from People was found where b.Name IS NULL evals to true, then shouldn't that mean that data found in People a was never found? (impossible?)
Just to be very clear, I'm not looking for a "solution". The code is what it is. I'm merely trying to understand its behavior for the purpose of re-engineering it.
If this code indeed never returns results, then I'll conclude it was written incorrectly and use that knowledge during the re-engineering.
If there is a valid data scenario where it would/could return results, then that will be news to me and I'll have to go back to the books on SQL Joins! #DrivenCrazy
Yes. There are circumstances where this query will retrieve rows.
The query
SELECT a.*
FROM (
SELECT DISTINCT Id, PName
FROM People
WHERE Id > 3
) a
LEFT JOIN People b
ON a.PName = b.PName
WHERE b.PName IS NULL;
is roughly (maybe even exactly) equivalent to...
select distinct Id, PName
from People
where Id > 3 and PName is null;
Why?
Tested it using this code (mysql).
create table People (Id int, PName varchar(50));
insert into People (Id, Pname)
values (1, 'Bob Slydell'),
(2, 'Jim Halpert'),
(3,'Pamela Landy'),
(4,'Bob Wiley'),
(5,'Jim Hawkins');
insert into People (Id, PName) values (6,null);
Now run the query. You get
6, Null
I don't know if your schema allows null Name.
What value can a.PName have such that a.PName = b.PName finds no match and b.PName is Null?
Well it's written right there. b.PName is Null.
Can we prove that there is no other case where a row is returned?
Suppose there is a value for (Id,PName) such that PName is not null and a row is returned.
In order to satisfy the condition...
where b.PName is null
...such a value must include a PName that does not match any PName in the People table.
All a.PName and all b.PName values are drawn from People.PName ...
So a.PName may not match itself.
The only scalar value in SQL that does not equal itself is Null.
Therefore if there are no rows with Null PName this query will not return a row.
That's my proposed casual proof.
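The argument above can be checked mechanically. This is a minimal sketch that uses SQLite from Python rather than the MySQL session described in the answer (an assumption; the NULL comparison rules involved are the same). It runs the self anti-join before and after inserting a row with a NULL PName and compares the result with the simplified query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (Id INTEGER, PName TEXT)")
conn.executemany(
    "INSERT INTO People (Id, PName) VALUES (?, ?)",
    [
        (1, "Bob Slydell"),
        (2, "Jim Halpert"),
        (3, "Pamela Landy"),
        (4, "Bob Wiley"),
        (5, "Jim Hawkins"),
    ],
)

ANTI_JOIN = """
SELECT a.*
FROM (SELECT DISTINCT Id, PName FROM People WHERE Id > 3) a
LEFT JOIN People b ON a.PName = b.PName
WHERE b.PName IS NULL
"""
SIMPLE = "SELECT DISTINCT Id, PName FROM People WHERE Id > 3 AND PName IS NULL"

# With no NULL names, every row in a matches itself in b, so nothing survives.
before = conn.execute(ANTI_JOIN).fetchall()

# A NULL name never equals anything, including itself, so this row is unmatched.
conn.execute("INSERT INTO People (Id, PName) VALUES (6, NULL)")
after = conn.execute(ANTI_JOIN).fetchall()
simple = conn.execute(SIMPLE).fetchall()

print(before, after, simple)
```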
This is very confusing code. So #DrivenCrazy is appropriate.
The meaning of the query is exactly "return people with id > 3 and a null as name", i.e. it may return data but only if there are null-values in the name:
SELECT DISTINCT Id, PName
FROM People
WHERE Id > 3 and PName is null
The proof for this is rather simple, if we consider the meaning of the left join condition ... LEFT JOIN People b ON a.PName = b.PName together with the (overall) condition where b.PName is null:
Generally, a condition where PName = PName is true if and only if PName is not null, and it has exactly the same meaning as where PName is not null. Hence, the left join will match only tuples where pname is not null, but any matching row will subsequently be filtered out by the overall condition where pname is null.
Hence, the left join cannot introduce any new rows in the query, and it cannot reduce the set of rows of the left hand side (as a left join never does). So the left join is superfluous, and the only effective condition is where PName is null.
LEFT JOIN ON returns the rows that INNER JOIN ON returns plus unmatched rows of the left table extended by NULL for the right table columns. If the ON condition does not allow a matched row to have NULL in some column (like b.NAME here being equal to something) then the only NULLs in that column in the result are from unmatched left hand rows. So keeping rows with NULL for that column as the result gives exactly the rows unmatched by the INNER JOIN ON. (This is an idiom. In some cases it can also be expressed via NOT IN or EXCEPT.)
In your case the left table has distinct People rows with a.Id > 3 and the right table has all People rows. So the only a rows unmatched in a.Name = b.Name are those where a.Name IS NULL. So the WHERE returns those rows extended by NULLs.
SELECT * FROM
(SELECT DISTINCT * FROM People WHERE Id > 3 AND Name IS NULL) a
LEFT JOIN People b ON 1=0;
But then you SELECT a.*. So the entire query is just
SELECT DISTINCT * FROM People WHERE Id > 3 AND Name IS NULL;
Sure, a left join will return data even if the join is done on the same table.
Take your query:
SELECT a.*
FROM (
SELECT DISTINCT Id, Name
FROM People
WHERE Id > 3
) a
LEFT JOIN People b
ON a.Name = b.Name
WHERE b.Name IS NULL
With your sample data it returns no rows because of the final filter b.Name IS NULL. Without that filter it would return the 2 records with Id > 3.

How do I do an SQL query based on a foreign key field?

I have the following tables:
people:
id, name
parent:
id, people_id, name
I have tried the following:
SELECT * FROM people
LEFT JOIN parent ON people.id = parent.people_id
WHERE parent.name != 'Carol';
How do I find all the people whose parent's name is not Carol?
You can try below code
select people.name from people
inner join parent on people.id=parent.people_id
where parent.name not in ('Carol')
When two tables are related by a foreign key and you want all records from one table that have a related entry in the second table, use an INNER JOIN:
Select * from People INNER JOIN parent ON people.id = parent.people_id
WHERE parent.name <> 'Carol'
Similarly, a LEFT JOIN returns all records from the left table; any columns selected from the right table contain NULL where there is no related record.
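One consequence worth spelling out: a WHERE condition on a right-table column throws away the NULL-extended rows, so the LEFT JOIN from the question behaves like an INNER JOIN. The sketch below illustrates this with SQLite from Python; the sample names (Ann, Ben, Cy, Dave) and the helper names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE parent (id INTEGER PRIMARY KEY, people_id INTEGER, name TEXT);
INSERT INTO people VALUES (1, 'Ann'), (2, 'Ben'), (3, 'Cy');
-- Ann's parent is Carol, Ben's parent is Dave, Cy has no parent row.
INSERT INTO parent VALUES (1, 1, 'Carol'), (2, 2, 'Dave');
""")


def names(sql):
    """Run a query and return the first column as a list."""
    return [row[0] for row in conn.execute(sql)]


# NULL <> 'Carol' is not true, so Cy is filtered out despite the LEFT JOIN.
filtered_left = names("""
    SELECT people.name FROM people
    LEFT JOIN parent ON people.id = parent.people_id
    WHERE parent.name <> 'Carol'
    ORDER BY people.id
""")

inner = names("""
    SELECT people.name FROM people
    INNER JOIN parent ON people.id = parent.people_id
    WHERE parent.name <> 'Carol'
    ORDER BY people.id
""")

# Keeping people without any parent row requires saying so explicitly.
keep_unmatched = names("""
    SELECT people.name FROM people
    LEFT JOIN parent ON people.id = parent.people_id
    WHERE parent.name <> 'Carol' OR parent.id IS NULL
    ORDER BY people.id
""")

print(filtered_left, inner, keep_unmatched)
```

Whether people with no parent row should count as "parent is not Carol" depends on what the question intends; the last query shows how to include them if so.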
First of all, why would you need two tables? Why not have a single table named "Person" with ID, Name and ParentID columns,
where ParentID is optional and references the ID of the parent, if there is one?
Then run the following query:
select * from PERSON where Name not like 'Carol%' and ParentID IS NOT NULL;
SELECT * FROM people WHERE EXISTS(SELECT 1 FROM parent WHERE people_id = id AND name <> 'Carol')
First of all, the table structure you have chosen restricts future growth. For example, if you later want to add the parents of your parents, it won't work with this structure.
You can do it like this:
id | parent_id | people_name
Here parent_id is NULL for a person with no parent and holds the parent's id for those who have one. To retrieve the data you then use a SELF JOIN (a join on the same table).
Select * from people P
INNER JOIN parent PA ON PA.people_id = P.id
where PA.name not in ('Carol')
The difference between INNER JOIN and LEFT OUTER JOIN is:
1) INNER JOIN returns only the rows that match in both tables. For example, if parent_id in the people table is nullable, a row with a NULL parent_id is discarded entirely.
2) LEFT OUTER JOIN returns all the rows from the left table together with the related rows from the right table, with NULLs in the right-hand columns where there is no match.

SQL Server: querying hierarchical and referenced data

I'm working on an asset database that has a hierarchy. Also, there is a "ReferenceAsset" table, that effectively points back to an asset. The Reference Asset basically functions as an override, but it is selected as if it were a unique, new asset. One of the overrides that gets set, is the parent_id.
Columns that are relevant to selecting the hierarchy:
Asset: id (primary), parent_id
Asset Reference: id (primary), asset_id (foreignkey->Asset), parent_id (always an Asset)
---EDITED 5/27----
Sample Relevant Table Data (after joins):
id | asset_id | name | parent_id | milestone | type
3 3 suit null march shape
4 4 suit_banker 3 april texture
5 5 tie null march shape
6 6 tie_red 5 march texture
7 7 tie_diamond 5 june texture
-5 6 tie_red 4 march texture
Rows with id < 0 (like the last row) signify assets that are referenced. Referenced assets have a few columns that are overridden (in this case, only parent_id is important).
The expectation is that if I select all assets from april, I should do a secondary select to get the entire tree branches of the matching query:
so initially the query match would result in:
4 4 suit_banker 3 april texture
Then after the CTE, we get the complete hierarchy and our result should be this (so far this is working)
3 3 suit null march shape
4 4 suit_banker 3 april texture
-5 6 tie_red 4 march texture
and you see, the parent of id:-5 is there, but what is missing, that is needed, is the referenced asset, and the parent of the referenced asset:
5 5 tie null march shape
6 6 tie_red 5 march texture
Currently my solution works for this, but it is limited to only a single depth of references (and I feel the implementation is quite ugly).
---Edited----
Here is my primary Selection Function. This should better demonstrate where the real complication lies: the AssetReference.
Select A.id as id, A.id as asset_id, A.name,A.parent_id as parent_id, A.subPath, T.name as typeName, A2.name as parent_name, B.name as batchName,
L.name as locationName,AO.owner_name as ownerName, T.id as typeID,
M.name as milestoneName, A.deleted as bDeleted, 0 as reference, W.phase_name, W.status_name
FROM Asset as A Inner Join Type as T on A.type_id = T.id
Inner Join Batch as B on A.batch_id = B.id
Left Join Location L on A.location_id = L.id
Left Join Asset A2 on A.parent_id = A2.id
Left Join AssetOwner AO on A.owner_id = AO.owner_id
Left Join Milestone M on A.milestone_id = M.milestone_id
Left Join Workflow as W on W.asset_id = A.id
where A.deleted <= @showDeleted
UNION
Select -1*AR.id as id, AR.asset_id as asset_id, A.name, AR.parent_id as parent_id, A.subPath, T.name as typeName, A2.name as parent_name, B.name as batchName,
L.name as locationName,AO.owner_name as ownerName, T.id as typeID,
M.name as milestoneName, A.deleted as bDeleted, 1 as reference, NULL as phase_name, NULL as status_name
FROM Asset as A Inner Join Type as T on A.type_id = T.id
Inner Join Batch as B on A.batch_id = B.id
Inner Join AssetReference AR on AR.asset_id = A.id
Left Join Location L on A.location_id = L.id
Left Join Asset A2 on AR.parent_id = A2.id
Left Join AssetOwner AO on A.owner_id = AO.owner_id
Left Join Milestone M on A.milestone_id = M.milestone_id
where A.deleted <= @showDeleted
I have a stored procedure that takes a temp table (#temp) and finds all the elements of the hierarchy. The strategy I employed was this:
1. Select the entire system hierarchy into a temp table (#treeIDs), represented by a comma-separated list of each entire tree branch
2. Get the entire hierarchy of assets matching the query (from #temp)
3. Get all reference assets pointed to by assets from that hierarchy
4. Parse the hierarchy of all reference assets
This works for now because reference assets are always the last item on a branch, but if they weren't, I think I would be in trouble. I feel like I need some better form of recursion.
Here is my current code, which is working, but i am not proud of it, and I know it is not robust (because it only works if the references are at the bottom):
Step 1. build the entire hierarchy
;WITH Recursive_CTE AS (
SELECT Cast(id as varchar(100)) as Hierarchy, parent_id, id
FROM #assetIDs
Where parent_id is Null
UNION ALL
SELECT
CAST(parent.Hierarchy + ',' + CAST(t.id as varchar(100)) as varchar(100)) as Hierarchy, t.parent_id, t.id
FROM Recursive_CTE parent
INNER JOIN #assetIDs t ON t.parent_id = parent.id
)
Select Distinct h.id, Hierarchy as idList into #treeIDs
FROM ( Select Hierarchy, id FROM Recursive_CTE ) parent
CROSS APPLY dbo.SplitIDs(Hierarchy) as h
Step 2. Select the branches of all assets that match the query
Select DISTINCT L.id into #RelativeIDs FROM #treeIDs
CROSS APPLY dbo.SplitIDs(idList) as L
WHERE #treeIDs.id in (Select id FROM #temp)
Step 3. Get all Reference Assets in the branches
(Reference assets have negative id values, hence the id < 0 part)
Select asset_id INTO #REFLinks FROM #AllAssets WHERE id in
(Select #AllAssets.asset_id FROM #AllAssets Inner Join #RelativeIDs
on #AllAssets.id = #RelativeIDs.id Where #RelativeIDs.id < 0)
Step 4. Get the branches of anything found in step 3
Select DISTINCT L.id into #extraRelativeIDs FROM #treeIDs
CROSS APPLY dbo.SplitIDs(idList) as L
WHERE
exists (Select #REFLinks.asset_id FROM #REFLinks WHERE #REFLinks.asset_id = #treeIDs.id)
and Not Exists (select id FROM #RelativeIDs Where id = #treeIDs.id)
I've tried to just show the relevant code. I am super grateful to anyone who can help me find a better solution!
--getting all of the children of a root node (there could be more than one root, which would require revising the query a bit)
--column names (id, parent_id, name) are taken from the question; adjust them to your schema
DECLARE @AssetID int = (SELECT id FROM Asset WHERE parent_id IS NULL);
--algorithm is relational recursion
--gets the top level in hierarchy we want. The hierarchy column
--will show the row's place in the hierarchy from this query only
--not in the overall reality of the row's place in the table
WITH Asset_Hierarchy(id, parent_id, LevelCode, Hierarchy)
AS
(
SELECT id, parent_id,
1 AS LevelCode, CAST(id AS varchar(max)) AS Hierarchy
FROM Asset
WHERE id = @AssetID
UNION ALL
--joins back to the CTE to recursively retrieve the rows
--note that LevelCode is incremented on each iteration
SELECT a.id, a.parent_id,
h.LevelCode + 1 AS LevelCode,
h.Hierarchy + '\' + CAST(a.id AS varchar(20)) AS Hierarchy
FROM Asset AS a
INNER JOIN Asset_Hierarchy AS h
--use to get children, since the parent_id of the child will be set to the id
--of the current row
ON a.parent_id = h.id
--use instead to get parents, since the parent of the Asset_Hierarchy row will be the asset,
--not the parent.
--ON a.id = h.parent_id
)
--return results from the CTE, joining to the Asset data to get the asset name
SELECT a.id, a.name,
h.LevelCode, h.Hierarchy
FROM Asset AS a
INNER JOIN Asset_Hierarchy AS h
ON a.id = h.id
ORDER BY h.Hierarchy;
---that is the structure you will want. I would need a little more clarification of your table structure
It would help to know your underlying table structure. There are two approaches which should work, depending on your environment. SQL Server understands XML, so you could store the hierarchy as an XML structure; or you could simply have a single table in which each row has a unique primary key id and a parentid, where parentid is a foreign key referencing id. The data for the node are just standard columns. You can use a CTE, or a function powering a calculated column, to determine the degree of nesting for each node. The limitation is that a node can only have one parent.
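For the single-table id/parentid approach, a recursive CTE can walk the tree in either direction. The following is a minimal sketch using SQLite from Python (SQL Server omits the RECURSIVE keyword and concatenates with + rather than ||). The Asset table here is reduced to id, name and parent_id, using the sample rows from the question without the reference rows; the query names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Asset (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER)")
conn.executemany(
    "INSERT INTO Asset VALUES (?, ?, ?)",
    [
        (3, "suit", None),
        (4, "suit_banker", 3),
        (5, "tie", None),
        (6, "tie_red", 5),
        (7, "tie_diamond", 5),
    ],
)

# Walk down: start at one node and repeatedly add rows whose parent_id
# points at a row already in the result. level is the nesting depth.
DESCENDANTS = """
WITH RECURSIVE tree(id, name, level, path) AS (
    SELECT id, name, 1, CAST(id AS TEXT) FROM Asset WHERE id = ?
    UNION ALL
    SELECT a.id, a.name, t.level + 1, t.path || ',' || a.id
    FROM Asset a
    JOIN tree t ON a.parent_id = t.id
)
SELECT id, name, level, path FROM tree ORDER BY path
"""

# Walk up: start at one node and repeatedly add the row its parent_id points at.
ANCESTORS = """
WITH RECURSIVE up(id, name, parent_id, level) AS (
    SELECT id, name, parent_id, 0 FROM Asset WHERE id = ?
    UNION ALL
    SELECT a.id, a.name, a.parent_id, u.level + 1
    FROM Asset a
    JOIN up u ON a.id = u.parent_id
)
SELECT id, name FROM up ORDER BY level
"""

down = conn.execute(DESCENDANTS, (5,)).fetchall()
up = conn.execute(ANCESTORS, (4,)).fetchall()
print(down)
print(up)
```

Because the recursion keeps going until no new rows are found, it is not limited to a fixed depth, which is the limitation the question describes for its current approach.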

SQL Server EXISTS query to determine relationship

I have the following tables:
Foo
FooId INT PRIMARY KEY
FooRelationship
FooRelationshipId INT PRIMARY KEY IDENTITY
FooParentId INT FK
FooChildId INT FK
How would I write a query that returns every id from Foo together with the status of the record (whether it is a parent, a child or neither)?
Rules:
A foo will only be a parent or a child or neither.
A foo can be the parent of multiple different foos.
A foo can not be the child of more than one foo.
I originally wrote this query:
SELECT
FooId,
CASE
WHEN Parent.FooRelationshipId IS NOT NULL THEN 'Parent'
WHEN Child.FooRelationshipId IS NOT NULL THEN 'Child'
ELSE 'Neither'
END
FROM Foo F
LEFT JOIN FooRelationship Parent ON F.FooId = Parent.FooParentId
LEFT JOIN FooRelationship Child ON F.FooId = Child.FooChildId
This is broken because if a Foo is a parent to two other Foos then it returns that id twice.
How can I rewrite this to either not use a join, or to use an EXISTS or something similar?
Just use DISTINCT - this is a good use case for it. An EXISTS in the WHERE clause would only filter rows, whereas you need a label for every row, so the joins stay:
SELECT DISTINCT
FooId,
CASE
WHEN Parent.FooRelationshipId IS NOT NULL THEN 'Parent'
WHEN Child.FooRelationshipId IS NOT NULL THEN 'Child'
ELSE 'Neither'
END
FROM Foo F
LEFT JOIN FooRelationship Parent ON F.FooId = Parent.FooParentId
LEFT JOIN FooRelationship Child ON F.FooId = Child.FooChildId
I'm not normally a big fan of DISTINCT because it's often used to hide messy data, but I think this is an appropriate use for it.
Be warned it may slow things down dramatically if you are using it across a large number of fields and rows.
If you want to get just these values and then populate the rest of the rows as well, you can do a subquery for the relationship logic:
SELECT s.FooID, s.Relationship, T.*
FROM Table T
INNER JOIN (SELECT DISTINCT
FooId,
CASE
WHEN Parent.FooRelationshipId IS NOT NULL THEN 'Parent'
WHEN Child.FooRelationshipId IS NOT NULL THEN 'Child'
ELSE 'Neither'
END as [Relationship]
FROM Foo F
LEFT JOIN FooRelationship Parent ON F.FooId = Parent.FooParentId
LEFT JOIN FooRelationship Child ON F.FooId = Child.FooChildId) s
ON s.FooId = t.FooID
Try to use SELECT DISTINCT FooID, ...
it will return just one FooID in case that you mentioned as problematic
Just modify your existing query to do a select distinct instead of a plain select.
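For comparison, the same classification can also be sketched with EXISTS inside a CASE expression, which yields one row per Foo without joins or DISTINCT. This is an illustrative sketch using SQLite from Python rather than SQL Server; the sample rows and the Relationship alias are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Foo (FooId INTEGER PRIMARY KEY);
CREATE TABLE FooRelationship (
    FooRelationshipId INTEGER PRIMARY KEY,
    FooParentId INTEGER REFERENCES Foo(FooId),
    FooChildId INTEGER REFERENCES Foo(FooId)
);
INSERT INTO Foo VALUES (1), (2), (3), (4);
-- Foo 1 is the parent of both 2 and 3; Foo 4 has no relationship.
INSERT INTO FooRelationship VALUES (1, 1, 2), (2, 1, 3);
""")

# Each EXISTS answers a yes/no question per Foo row, so a parent with
# several children still produces exactly one output row.
QUERY = """
SELECT F.FooId,
       CASE
           WHEN EXISTS (SELECT 1 FROM FooRelationship R
                        WHERE R.FooParentId = F.FooId) THEN 'Parent'
           WHEN EXISTS (SELECT 1 FROM FooRelationship R
                        WHERE R.FooChildId = F.FooId) THEN 'Child'
           ELSE 'Neither'
       END AS Relationship
FROM Foo F
ORDER BY F.FooId
"""

statuses = conn.execute(QUERY).fetchall()
print(statuses)
```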