SQL Server EXISTS query to determine relationship - sql

I have the following tables:
Foo
FooId INT PRIMARY KEY
FooRelationship
FooRelationshipId INT PRIMARY KEY IDENTITY
FooParentId INT FK
FooChildId INT FK
How would I write a query that returns every id from Foo along with the status of the record (whether it is a parent, a child, or neither)?
Rules:
A foo will only be a parent or a child or neither.
A foo can be the parent of multiple different foos.
A foo can not be the child of more than one foo.
I originally wrote this query:
SELECT
FooId,
CASE
WHEN Parent.FooRelationshipId IS NOT NULL THEN 'Parent'
WHEN Child.FooRelationshipId IS NOT NULL THEN 'Child'
ELSE 'Neither'
END
FROM Foo F
LEFT JOIN FooRelationship Parent ON F.FooId = Parent.FooParentId
LEFT JOIN FooRelationship Child ON F.FooId = Child.FooChildId
This is broken because if a Foo is a parent to two other Foos, it returns that id twice.
How can I rewrite this to either avoid the join or use EXISTS or something similar?

Just use DISTINCT; this is a good use case for it. It is the smallest change to your query, although an EXISTS subquery inside each WHEN would work as well, since only the existence of a matching row matters:
SELECT DISTINCT
FooId,
CASE
WHEN Parent.FooRelationshipId IS NOT NULL THEN 'Parent'
WHEN Child.FooRelationshipId IS NOT NULL THEN 'Child'
ELSE 'Neither'
END
FROM Foo F
LEFT JOIN FooRelationship Parent ON F.FooId = Parent.FooParentId
LEFT JOIN FooRelationship Child ON F.FooId = Child.FooChildId
I'm not normally a big fan of DISTINCT because it's often used to hide messy data, but I think this is an appropriate use for it.
Be warned it may slow things down dramatically if you are using it across a large number of fields and rows.
If you want to get just these values and then populate the rest of the rows as well, you can do a subquery for the relationship logic:
SELECT s.FooID, s.Relationship, T.*
FROM [Table] T -- placeholder name for the table that holds the rest of the columns
INNER JOIN (SELECT DISTINCT
FooId,
CASE
WHEN Parent.FooRelationshipId IS NOT NULL THEN 'Parent'
WHEN Child.FooRelationshipId IS NOT NULL THEN 'Child'
ELSE 'Neither'
END as [Relationship]
FROM Foo F
LEFT JOIN FooRelationship Parent ON F.FooId = Parent.FooParentId
LEFT JOIN FooRelationship Child ON F.FooId = Child.FooChildId) s
ON s.FooId = t.FooID
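For comparison, the EXISTS form that the question asks about can be sketched as follows. This is a minimal sketch, not a definitive implementation: it runs against SQLite through Python's sqlite3 module as a stand-in for SQL Server, and the sample rows are invented for illustration. Because each EXISTS only tests whether a matching row is present, a parent with several children still appears once, with no DISTINCT needed.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Foo (FooId INTEGER PRIMARY KEY);
CREATE TABLE FooRelationship (
    FooRelationshipId INTEGER PRIMARY KEY AUTOINCREMENT,
    FooParentId INTEGER REFERENCES Foo(FooId),
    FooChildId  INTEGER REFERENCES Foo(FooId)
);
-- Illustrative data: 1 is the parent of 2 and 3; 4 is unrelated.
INSERT INTO Foo (FooId) VALUES (1), (2), (3), (4);
INSERT INTO FooRelationship (FooParentId, FooChildId) VALUES (1, 2), (1, 3);
""")

query = """
SELECT F.FooId,
       CASE
           WHEN EXISTS (SELECT 1 FROM FooRelationship R
                        WHERE R.FooParentId = F.FooId) THEN 'Parent'
           WHEN EXISTS (SELECT 1 FROM FooRelationship R
                        WHERE R.FooChildId = F.FooId) THEN 'Child'
           ELSE 'Neither'
       END AS Relationship
FROM Foo F
ORDER BY F.FooId
"""

# One row per Foo, even though Foo 1 has two children.
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The CASE expression itself is plain SQL and should carry over to SQL Server unchanged; only the table setup is SQLite-specific.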

Try SELECT DISTINCT FooId, ...
It will return each FooId only once in the case you described as problematic.

Just modify your existing query to do a select distinct instead of a plain select.

Related

How to retrieve the properties stored in SQL with multiple inheritance

I'm storing the records in SQL that represent a multiple inheritance relationship similar to the one in C++. Like that:
CREATE TABLE Classes
(
id INTEGER PRIMARY KEY,
name TEXT NOT NULL
);
CREATE TABLE Inheritance
(
class_id INTEGER NOT NULL,
base_class_id INTEGER NOT NULL,
FOREIGN KEY (class_id) REFERENCES Classes(id),
FOREIGN KEY (base_class_id) REFERENCES Classes(id)
);
The classes have properties of two types. These properties are inherited by the classes, but in different ways. The first type of property, whenever defined for a class, overrides the value of the same property in any of its base classes. The other type accumulates values: the property is actually a set of values, and each class inherits all values of its base classes and may add a single additional value to this set:
CREATE TABLE OverridableValues
(
class_id INTEGER PRIMARY KEY,
value TEXT NOT NULL,
FOREIGN KEY (class_id) REFERENCES Classes(id)
);
CREATE TABLE AccumulableValues
(
class_id INTEGER PRIMARY KEY,
value TEXT NOT NULL,
FOREIGN KEY (class_id) REFERENCES Classes(id)
);
The caveat with OverridableValues: there are no cases when the same property is overridden on different paths of multiple inheritance.
I'm trying to design queries using common table expressions that would return the value/values for a given property and class.
The approach that I'm trying to use is to start from the root (assume for simplicity that there is a single root class), and then to build the tree of paths from the root to every other class. The problem is how to pass the information about properties from the parents to children. For example below is an incorrect attempt to do that:
WITH ParentProperty (id, value) AS
(
SELECT c.id, a.value
FROM Classes c
LEFT JOIN AccumulableValues a
ON a.class_id = c.id
WHERE c.id = 1 --This is the root
UNION ALL
SELECT i.class_id, IFNULL(a.value, ba.value)
FROM ParentProperty p
JOIN Inheritance i
ON i.base_class_id = p.id
LEFT JOIN AccumulableValues a
ON a.class_id = i.class_id
LEFT JOIN AccumulableValues ba
ON ba.class_id = i.base_class_id
)
SELECT id, value
FROM ParentProperty;
I feel like I need one more UNION ALL inside the CTE, which is not allowed. But without it I either miss proper values or inherited ones. So far I've failed to design the query for both types of properties.
I'm using SQLite as my database engine.
I've finally found a solution. I'm describing it below, but more efficient ones are still welcome.
Let's start with the accumulable property. My problem was that I tried to put more than one UNION ALL into a single CTE. I solved that by adding an additional CTE (see AcquiresFrom):
WITH AcquiresFrom (class_id, from_class_id, value) AS
(
SELECT a.class_id, a.class_id, a.value
FROM AccumulableValues a
UNION ALL
SELECT i.class_id, i.base_class_id, NULL
FROM Inheritance i
),
ClassProperty (class_id, value) AS
(
SELECT c.id, NULL
FROM Classes c
LEFT JOIN Inheritance i
ON i.class_id = c.id
WHERE i.base_class_id IS NULL
UNION ALL
SELECT a.class_id, IFNULL(a.value, p.value)
FROM ClassProperty p
JOIN AcquiresFrom a
ON (a.from_class_id = p.class_id AND a.from_class_id != a.class_id) OR
(a.class_id = p.class_id AND a.class_id = a.from_class_id AND p.value IS NULL)
)
SELECT DISTINCT class_id, value
FROM ClassProperty
WHERE value IS NOT NULL
ORDER BY class_id;
AcquiresFrom describes how a class acquires a value: the class either introduces a new value (the first SELECT) or inherits it (the second SELECT). ClassProperty incrementally propagates the values from base classes to derived classes. The only thing left to do is to eliminate duplicates and NULL values (the final SELECT DISTINCT ... WHERE value IS NOT NULL).
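To see the accumulable query at work, here is a small worked sketch that runs it through Python's sqlite3 module. The diamond-shaped hierarchy (Base, Left, Right, Bottom) and the values 'a', 'b', 'd' are invented for illustration, and the table name is spelled AccumulableValues to match the CREATE TABLE statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Classes (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE Inheritance (
    class_id INTEGER NOT NULL REFERENCES Classes(id),
    base_class_id INTEGER NOT NULL REFERENCES Classes(id)
);
CREATE TABLE AccumulableValues (
    class_id INTEGER PRIMARY KEY REFERENCES Classes(id),
    value TEXT NOT NULL
);
-- Illustrative diamond: Left and Right derive from Base, Bottom from both.
INSERT INTO Classes VALUES (1, 'Base'), (2, 'Left'), (3, 'Right'), (4, 'Bottom');
INSERT INTO Inheritance VALUES (2, 1), (3, 1), (4, 2), (4, 3);
-- Base adds 'a', Left adds 'b', Bottom adds 'd'; Right adds nothing.
INSERT INTO AccumulableValues VALUES (1, 'a'), (2, 'b'), (4, 'd');
""")

query = """
WITH RECURSIVE AcquiresFrom (class_id, from_class_id, value) AS
(
    SELECT a.class_id, a.class_id, a.value
    FROM AccumulableValues a
    UNION ALL
    SELECT i.class_id, i.base_class_id, NULL
    FROM Inheritance i
),
ClassProperty (class_id, value) AS
(
    SELECT c.id, NULL
    FROM Classes c
    LEFT JOIN Inheritance i ON i.class_id = c.id
    WHERE i.base_class_id IS NULL
    UNION ALL
    SELECT a.class_id, IFNULL(a.value, p.value)
    FROM ClassProperty p
    JOIN AcquiresFrom a
      ON (a.from_class_id = p.class_id AND a.from_class_id != a.class_id) OR
         (a.class_id = p.class_id AND a.class_id = a.from_class_id AND p.value IS NULL)
)
SELECT DISTINCT class_id, value
FROM ClassProperty
WHERE value IS NOT NULL
ORDER BY class_id
"""

# Collect the accumulated set of values per class.
inherited = {}
for class_id, value in conn.execute(query):
    inherited.setdefault(class_id, set()).add(value)

for class_id in sorted(inherited):
    print(class_id, sorted(inherited[class_id]))
```

In this data Bottom ends up with its own value plus everything reachable through either base, and the value from Base is counted once even though it arrives along two paths.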
The overridable property is more complex.
WITH Roots (id, value) AS
(
SELECT c.id, o.value
FROM Classes c
LEFT JOIN Inheritance i
ON i.class_id = c.id
LEFT JOIN OverridableValues o
ON o.class_id = c.id
WHERE i.base_class_id IS NULL
),
PossibleValues (id, acquired_from_id, value) AS
(
SELECT r.id, r.id, r.value
FROM Roots r
UNION ALL
SELECT i.class_id, CASE WHEN o.value IS NULL THEN p.acquired_from_id ELSE i.class_id END, IFNULL(o.value, p.value)
FROM PossibleValues p
JOIN Inheritance i
ON i.base_class_id = p.id
LEFT JOIN OverridableValues o
ON o.class_id = i.class_id
),
Split (class_id, base_class_id, direct) AS (
SELECT i.class_id, i.base_class_id, 1
FROM Inheritance i
UNION ALL
SELECT i.class_id, i.base_class_id, 0
FROM Inheritance i
),
Ancestors (id, ancestor_id) AS (
SELECT r.id, NULL
FROM Roots r
UNION ALL
SELECT s.class_id, CASE WHEN s.direct == 1 THEN a.id ELSE a.ancestor_id END
FROM Ancestors a
JOIN Split s
ON s.base_class_id = a.id
)
SELECT DISTINCT p.id, p.value
FROM PossibleValues p
WHERE p.acquired_from_id NOT IN
(
SELECT a.ancestor_id
FROM PossibleValues p1
JOIN PossibleValues p2
ON p2.id = p1.id
JOIN Ancestors a
ON a.id = p1.acquired_from_id AND a.ancestor_id = p2.acquired_from_id
WHERE p1.id = p.id
);
Roots is the list of classes that have no parents. The PossibleValues CTE propagates/overrides the values from the roots to the final classes, and breaks multiple inheritance cycles, making the structure tree-like. All valid id/value pairs are present in the result of this query, but some invalid values are present as well. These invalid values are those that were overridden on one of the branches, where this fact is not known on another branch. The acquired_from_id lets us reconstruct which class first introduced the value (which is useful whenever two different classes introduce the same value).
The last thing left is to resolve the ambiguity caused by multiple inheritance. Knowing the class and two possible values, we need to know whether one value overrides the other. That is resolved with the Ancestors expression.

How do I do an SQL query based on a foreign key field?

I have the following tables:
people:
id, name
parent:
id, people_id, name
I have tried the following:
SELECT * FROM people
LEFT JOIN parent ON people.id = parent.people_id
WHERE parent.name != 'Carol';
How do I find all the people whose parent's name is not Carol?
You can try below code
select people.name from people
inner join parent on people.id=parent.people_id
where parent.name not in ('Carol')
This applies when the two tables are related through a foreign key.
If you want all records from one table that have a related entry in a second table, use INNER JOIN:
Select * from People INNER JOIN parent ON people.id = parent.people_id
WHERE parent.name <> 'Carol'
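Similarly, LEFT JOIN returns all records from the left table; if a row has no related record in the right table, any columns selected from the right table contain NULL. Note that a WHERE condition such as parent.name != 'Carol' then removes those rows again, because a comparison with NULL is never true.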
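The interaction between LEFT JOIN and a WHERE filter on the right-hand table can be checked with a small sketch. It uses SQLite through Python's sqlite3 module, and the people Ann, Bob, and Cid are invented for illustration; Cid deliberately has no row in parent. The second query is one possible reading of the question ("no parent named Carol"), written with NOT EXISTS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE parent (
    id INTEGER PRIMARY KEY,
    people_id INTEGER NOT NULL REFERENCES people(id),
    name TEXT NOT NULL
);
-- Illustrative data: Ann's parent is Carol, Bob's is Dave, Cid has none.
INSERT INTO people VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
INSERT INTO parent VALUES (1, 1, 'Carol'), (2, 2, 'Dave');
""")

# The query from the question: the WHERE clause compares NULL with 'Carol'
# for Cid, which is not true, so Cid is dropped despite the LEFT JOIN.
left_join_names = [r[0] for r in conn.execute("""
    SELECT people.name
    FROM people
    LEFT JOIN parent ON people.id = parent.people_id
    WHERE parent.name != 'Carol'
    ORDER BY people.name
""")]

# NOT EXISTS keeps everyone who has no parent row named Carol,
# including people with no parent row at all.
not_exists_names = [r[0] for r in conn.execute("""
    SELECT people.name
    FROM people
    WHERE NOT EXISTS (SELECT 1 FROM parent
                      WHERE parent.people_id = people.id
                        AND parent.name = 'Carol')
    ORDER BY people.name
""")]

print(left_join_names)
print(not_exists_names)
```

Which of the two results is wanted depends on whether people without any recorded parent should count as "parent is not Carol".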
First of all, why do you need two tables? Why not have a single table named "Person" with ID, Name, and ParentID columns,
where ParentID is optional and references the ID of the parent, if there is one?
Then run the following query:
select p.* from Person p inner join Person pa on pa.ID = p.ParentID where pa.Name <> 'Carol';
SELECT * FROM people WHERE EXISTS(SELECT 1 FROM parent WHERE parent.people_id = people.id AND parent.name <> 'Carol')
First of all, the table structure you have chosen restricts future growth. For example, if you later want to add the parents of your parents, it won't work with this table structure.
You can do it like this:
id | parent_id | people_name
Here parent_id is NULL for people who have no parent, and holds the parent's id for those who do. To retrieve the data you then have to use a self join (joining the table to itself).
With your current two tables, the query is:
Select * from people P
INNER JOIN parent PA ON PA.people_id = P.id
where PA.name not in ('Carol')
The difference between INNER JOIN and LEFT OUTER JOIN is:
1) INNER JOIN returns only the rows that have a match in both tables.
For example, if a row in the people table has no matching row in parent, INNER JOIN discards that row completely, whereas LEFT OUTER JOIN returns all rows from the left table together with the related rows from the right table, with NULL in every right-table column where there is no match.

Query that selects the sum of all records that are referenced in another table

I have two tables, parent(id, name, child_id) and child(id, name, number). Not all parents have children and not all children have parents. I need a query that selects the sum over all records in the child table, and also the sum over only those records that have a parent and the sum over those that don't; whether a child has a parent is determined by the parent table's child_id column. How can this be done?
select
sum(c.number) AS a,
sum(all_child_records_that_have_a_parent) AS b,
sum(all_child_records_that_do not have a parent) AS c /*do not use a-b if possible*/
from
child c
The "all_child_records_that_have_a_parent" part is the one I can't figure out :)
all_child_records_that_do not have a parent:
SELECT *
FROM child
WHERE id NOT IN (SELECT child_id FROM parent WHERE child_id IS NOT NULL)
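One caveat with NOT IN is worth checking here, because the question says some parents have no child, so parent.child_id can be NULL. The sketch below uses SQLite through Python's sqlite3 module with invented sample rows, and compares an unfiltered NOT IN subquery with a NOT EXISTS formulation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE child (id INTEGER PRIMARY KEY, name TEXT, number INTEGER);
CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT, child_id INTEGER);
-- Illustrative data: parent p2 has no child, so its child_id is NULL.
INSERT INTO child VALUES (1, 'a', 10), (2, 'b', 20), (3, 'c', 30);
INSERT INTO parent VALUES (1, 'p1', 1), (2, 'p2', NULL);
""")

# With a NULL in the subquery result, "id NOT IN (...)" evaluates to NULL
# for every non-matching row, so nothing is returned.
not_in_rows = conn.execute("""
    SELECT id FROM child
    WHERE id NOT IN (SELECT child_id FROM parent)
    ORDER BY id
""").fetchall()

# NOT EXISTS is unaffected by the NULL and returns the parentless children.
not_exists_rows = conn.execute("""
    SELECT c.id FROM child c
    WHERE NOT EXISTS (SELECT 1 FROM parent p WHERE p.child_id = c.id)
    ORDER BY c.id
""").fetchall()

print(not_in_rows)
print(not_exists_rows)
```

Filtering the subquery with child_id IS NOT NULL, or switching to NOT EXISTS, avoids the empty result.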
You can select distinct child ids from the parent table and outer join these to your child table. Then check for NULL.
select
sum(c.number) AS sum_all_c,
sum(case when x.child_id is not null then c.number end) AS sum_c_with_parent,
sum(case when x.child_id is null then c.number end) AS sum_c_without_parent
from child c
left outer join (select distinct child_id from parent) x on x.child_id = c.id;
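The conditional-aggregation query above can be tried on a few rows. This sketch uses SQLite through Python's sqlite3 module; the sample data is invented, with child 1 referenced by two parents so that the DISTINCT in the derived table matters, and one parent with a NULL child_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE child (id INTEGER PRIMARY KEY, name TEXT, number INTEGER);
CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT, child_id INTEGER);
-- Illustrative data: child 1 has two parents, children 2 and 3 have none.
INSERT INTO child VALUES (1, 'a', 10), (2, 'b', 20), (3, 'c', 30);
INSERT INTO parent VALUES (1, 'p1', 1), (2, 'p2', 1), (3, 'p3', NULL);
""")

row = conn.execute("""
    SELECT
        SUM(c.number) AS sum_all_c,
        SUM(CASE WHEN x.child_id IS NOT NULL THEN c.number END) AS sum_c_with_parent,
        SUM(CASE WHEN x.child_id IS NULL THEN c.number END) AS sum_c_without_parent
    FROM child c
    LEFT OUTER JOIN (SELECT DISTINCT child_id FROM parent) x
        ON x.child_id = c.id
""").fetchone()

print(row)
```

Without the DISTINCT, child 1 would join to both of its parents and its number would be counted twice in the first two sums.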

SQL Server: querying hierarchical and referenced data

I'm working on an asset database that has a hierarchy. There is also a "ReferenceAsset" table that effectively points back to an asset. The Reference Asset basically functions as an override, but it is selected as if it were a unique, new asset. One of the overrides that gets set is the parent_id.
Columns that are relevant to selecting the hierarchy:
Asset: id (primary), parent_id
Asset Reference: id (primary), asset_id (foreignkey->Asset), parent_id (always an Asset)
---EDITED 5/27----
Sample Relevant Table Data (after joins):
id | asset_id | name        | parent_id | milestone | type
 3 |        3 | suit        | null      | march     | shape
 4 |        4 | suit_banker | 3         | april     | texture
 5 |        5 | tie         | null      | march     | shape
 6 |        6 | tie_red     | 5         | march     | texture
 7 |        7 | tie_diamond | 5         | june      | texture
-5 |        6 | tie_red     | 4         | march     | texture
Rows with id < 0 (like the last row) signify assets that are referenced. Referenced assets have a few columns that are overridden (in this case, only parent_id is important).
The expectation is that if I select all assets from april, I should do a secondary select to get the entire tree branches of the matching rows.
So initially the query match would result in:
 4 |        4 | suit_banker | 3         | april     | texture
Then, after the CTE, we get the complete hierarchy and our result should be this (so far this is working):
 3 |        3 | suit        | null      | march     | shape
 4 |        4 | suit_banker | 3         | april     | texture
-5 |        6 | tie_red     | 4         | march     | texture
As you can see, the parent of id -5 is there, but what is missing, and needed, is the referenced asset and the parent of the referenced asset:
 5 |        5 | tie         | null      | march     | shape
 6 |        6 | tie_red     | 5         | march     | texture
Currently my solution works for this, but it is limited to a single depth of references (and I feel the implementation is quite ugly).
---Edited----
Here is my primary Selection Function. This should better demonstrate where the real complication lies: the AssetReference.
Select A.id as id, A.id as asset_id, A.name,A.parent_id as parent_id, A.subPath, T.name as typeName, A2.name as parent_name, B.name as batchName,
L.name as locationName,AO.owner_name as ownerName, T.id as typeID,
M.name as milestoneName, A.deleted as bDeleted, 0 as reference, W.phase_name, W.status_name
FROM Asset as A Inner Join Type as T on A.type_id = T.id
Inner Join Batch as B on A.batch_id = B.id
Left Join Location L on A.location_id = L.id
Left Join Asset A2 on A.parent_id = A2.id
Left Join AssetOwner AO on A.owner_id = AO.owner_id
Left Join Milestone M on A.milestone_id = M.milestone_id
Left Join Workflow as W on W.asset_id = A.id
where A.deleted <= @showDeleted
UNION
Select -1*AR.id as id, AR.asset_id as asset_id, A.name, AR.parent_id as parent_id, A.subPath, T.name as typeName, A2.name as parent_name, B.name as batchName,
L.name as locationName,AO.owner_name as ownerName, T.id as typeID,
M.name as milestoneName, A.deleted as bDeleted, 1 as reference, NULL as phase_name, NULL as status_name
FROM Asset as A Inner Join Type as T on A.type_id = T.id
Inner Join Batch as B on A.batch_id = B.id
Inner Join AssetReference AR on AR.asset_id = A.id
Left Join Location L on A.location_id = L.id
Left Join Asset A2 on AR.parent_id = A2.id
Left Join AssetOwner AO on A.owner_id = AO.owner_id
Left Join Milestone M on A.milestone_id = M.milestone_id
where A.deleted <= @showDeleted
I have a stored procedure that takes a temp table (#temp) and finds all the elements of the hierarchy. The strategy I employed was this:
1. Select the entire system hierarchy into a temp table (#treeIDs), represented by a comma-separated list for each entire tree branch.
2. Get the entire hierarchy of assets matching the query (from #temp).
3. Get all reference assets pointed to by assets from the hierarchy.
4. Parse the hierarchy of all reference assets.
This works for now because reference assets are always the last item on a branch, but if they weren't, I think I would be in trouble. I feel like I need some better form of recursion.
Here is my current code, which is working, but I am not proud of it, and I know it is not robust (because it only works if the references are at the bottom):
Step 1. build the entire hierarchy
;WITH Recursive_CTE AS (
SELECT Cast(id as varchar(100)) as Hierarchy, parent_id, id
FROM #assetIDs
Where parent_id is Null
UNION ALL
SELECT
CAST(parent.Hierarchy + ',' + CAST(t.id as varchar(100)) as varchar(100)) as Hierarchy, t.parent_id, t.id
FROM Recursive_CTE parent
INNER JOIN #assetIDs t ON t.parent_id = parent.id
)
Select Distinct h.id, Hierarchy as idList into #treeIDs
FROM ( Select Hierarchy, id FROM Recursive_CTE ) parent
CROSS APPLY dbo.SplitIDs(Hierarchy) as h
Step 2. Select the branches of all assets that match the query
Select DISTINCT L.id into #RelativeIDs FROM #treeIDs
CROSS APPLY dbo.SplitIDs(idList) as L
WHERE #treeIDs.id in (Select id FROM #temp)
Step 3. Get all Reference Assets in the branches
(Reference assets have negative id values, hence the id < 0 part)
Select asset_id INTO #REFLinks FROM #AllAssets WHERE id in
(Select #AllAssets.asset_id FROM #AllAssets Inner Join #RelativeIDs
on #AllAssets.id = #RelativeIDs.id Where #RelativeIDs.id < 0)
Step 4. Get the branches of anything found in step 3
Select DISTINCT L.id into #extraRelativeIDs FROM #treeIDs
CROSS APPLY dbo.SplitIDs(idList) as L
WHERE
exists (Select #REFLinks.asset_id FROM #REFLinks WHERE #REFLinks.asset_id = #treeIDs.id)
and Not Exists (select id FROM #RelativeIDs Where id = #treeIDs.id)
I've tried to just show the relevant code. I am super grateful to anyone who can help me find a better solution!
-- Get a root node and all of its descendants (there could be more than one root,
-- which would require revising the query a bit).
DECLARE @AssetID int = (SELECT TOP (1) id FROM Asset WHERE parent_id IS NULL);
-- The algorithm is relational recursion.
-- The anchor member gets the top level of the hierarchy we want. The Hierarchy column
-- shows the row's place in the hierarchy for this query only,
-- not the row's overall place in the table.
WITH Asset_Hierarchy (id, parent_id, LevelCode, Hierarchy)
AS
(
SELECT id, parent_id,
1 AS LevelCode, CAST(id AS varchar(max)) AS Hierarchy
FROM Asset
WHERE id = @AssetID
UNION ALL
-- Joins back to the CTE to recursively retrieve the rows;
-- note that LevelCode is incremented on each iteration.
SELECT a.id, a.parent_id,
h.LevelCode + 1 AS LevelCode,
h.Hierarchy + '\' + CAST(a.id AS varchar(20)) AS Hierarchy
FROM Asset AS a
INNER JOIN Asset_Hierarchy AS h
-- Use this condition to get children, since the parent_id of the child
-- is the id of the current CTE row.
ON a.parent_id = h.id
-- To get parents instead, join on a.id = h.parent_id.
)
-- Return results from the CTE, joining to the Asset data to get the asset name.
SELECT a.id, a.name,
h.LevelCode, h.Hierarchy
FROM Asset AS a
INNER JOIN Asset_Hierarchy AS h
ON a.id = h.id
ORDER BY h.Hierarchy;
-- That is the structure you will want. I would need a little more clarification of your table structure.
It would help to know your underlying table structure. There are two approaches that should work, depending on your environment: SQL Server understands XML, so you could store the hierarchy as an XML structure, or you could simply have a single table in which each row has a unique primary key id and a parentid, where parentid is a foreign key referencing id. The data for the node are just standard columns. You can use a CTE, or a function powering a computed column, to determine the degree of nesting for each node. The limitation is that a node can only have one parent.
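As a rough sketch of the "better form of recursion" the question asks for, a single recursive CTE can walk up to ancestors, down to descendants, and jump from a reference row (id < 0) to the asset it points at, to any depth. Everything here is an assumption made for illustration: the AllAssets table mirrors the joined sample data from the question, the dirs helper and the 'up'/'down' labels are invented, and the code runs in SQLite through Python's sqlite3 module. SQLite allows UNION in a recursive CTE, which removes duplicates; SQL Server requires UNION ALL there, so this would need adapting rather than copying.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table holding the joined sample rows from the question.
CREATE TABLE AllAssets (
    id INTEGER PRIMARY KEY, asset_id INTEGER, name TEXT,
    parent_id INTEGER, milestone TEXT, type TEXT
);
INSERT INTO AllAssets VALUES
    ( 3, 3, 'suit',        NULL, 'march', 'shape'),
    ( 4, 4, 'suit_banker', 3,    'april', 'texture'),
    ( 5, 5, 'tie',         NULL, 'march', 'shape'),
    ( 6, 6, 'tie_red',     5,    'march', 'texture'),
    ( 7, 7, 'tie_diamond', 5,    'june',  'texture'),
    (-5, 6, 'tie_red',     4,    'march', 'texture');
""")

query = """
WITH RECURSIVE
dirs(dir) AS (VALUES ('up'), ('down')),
rel(id, dir) AS (
    -- Seed: every matching row is explored in both directions.
    SELECT a.id, d.dir
    FROM AllAssets a CROSS JOIN dirs d
    WHERE a.milestone = 'april'
    UNION
    SELECT n.id, d.dir
    FROM rel r
    JOIN AllAssets cur ON cur.id = r.id
    CROSS JOIN dirs d
    JOIN AllAssets n
      ON (r.dir = 'up'   AND d.dir = 'up'   AND n.id = cur.parent_id)  -- parent
      OR (r.dir = 'down' AND d.dir = 'down' AND n.parent_id = cur.id)  -- children
      OR (cur.id < 0 AND n.id = cur.asset_id)  -- referenced asset, both directions
)
SELECT DISTINCT id FROM rel ORDER BY id
"""

ids = [r[0] for r in conn.execute(query)]
print(ids)
```

Tracking the direction keeps siblings such as tie_diamond out of the result: a row reached while walking up is never expanded downward, and the referenced asset is treated as a fresh starting point.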

Retrieving parent value from mysql self join

I have a table with fields:
id
name
parent_id
grandparent_id
I want to select the id, name, parent name, grandparent name.
Can I do this with a self join? I basically want to retrieve the "name" value where parent_id = id and return one row.
Example:
id | name    | parent_id | grandparent_id
-----------------------------------------------
1  | Milton  | NULL      | NULL
2  | Year 3  | 1         | NULL
3  | Class A | 2         | 1
So I want to select the 3rd row (id = 3), but instead of returning just the parent_id and grandparent_id, I want the query to return the names of these records based on their ids. Can I create a composite field, say called parent_id_name and grandparent_id_name?
I'm pretty sure what I am doing can be achieved by a self join or subquery, but all of the code I have tried so far has failed to work. Any help would be really appreciated.
This is the query that you asked for:
# By using LEFT JOINs you will be able to read any record,
# even one with missing parent/grand-parent...
SELECT
child.id,
child.name,
parent.id,
parent.name,
gparent.id,
gparent.name
FROM
some_table child
LEFT JOIN some_table parent ON
parent.id = child.parent_id
LEFT JOIN some_table gparent ON
gparent.id = child.grandparent_id
WHERE
child.id = 3
BUT I would also add that the redundancy of having a field grandparent_id does NOT sound right to me...
Your table should be just:
id | name    | parent_id
1  | Milton  | NULL
2  | Year 3  | 1
3  | Class A | 2
Notice that, if I know that 1 is the parent of 2, I don't need to repeat that same information again on record 3...
In this last case, your select could be like this:
SELECT
child.id,
child.name,
parent.id,
parent.name,
gparent.id,
gparent.name
FROM
some_table child
LEFT JOIN some_table parent ON
parent.id = child.parent_id
LEFT JOIN some_table gparent ON
gparent.id = parent.parent_id -- See the difference?
WHERE
child.id = 3
The query would work the same, and you would also have a more "normalized" database.
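The normalized variant can be checked end to end with a short sketch. It runs in SQLite through Python's sqlite3 module rather than MySQL, and uses the placeholder table name some_table from the query above with the three sample rows; the column aliases are added only to tell the duplicate column names apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized table: only parent_id is stored, no grandparent_id.
CREATE TABLE some_table (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    parent_id INTEGER REFERENCES some_table(id)
);
INSERT INTO some_table VALUES
    (1, 'Milton',  NULL),
    (2, 'Year 3',  1),
    (3, 'Class A', 2);
""")

row = conn.execute("""
    SELECT
        child.id,
        child.name,
        parent.id    AS parent_id,
        parent.name  AS parent_name,
        gparent.id   AS grandparent_id,
        gparent.name AS grandparent_name
    FROM some_table child
    LEFT JOIN some_table parent  ON parent.id  = child.parent_id
    LEFT JOIN some_table gparent ON gparent.id = parent.parent_id
    WHERE child.id = 3
""").fetchone()

print(row)
```

The grandparent is reached through the parent row, so the table never has to store it twice.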
Edit: This is pretty basic stuff, but I guess it is relevant to this answer...
This kind of denormalization (i.e. to have both parent_id and grandparent_id on the same record) should not be used because it allows the database to be inconsistent.
For instance, let's suppose that a new record is inserted:
id | name        | parent_id | grandparent_id
1  | Milton      | NULL      | NULL
2  | Year 3      | 1         | NULL
3  | Class A     | 2         | 1
4  | Invalid Rec | 2         | 3
It doesn't make any sense, right? Record 4 is stating that 3 is its grandparent. So, 3 should be the parent of record 2. But that's not what is stated on record 3 itself. Which record is right?
You may think this is an odd error, and that your database will never become like this. But my experience says otherwise - if an error may happen, it will eventually. Denormalization should be avoided, not just because some database guru says so, but because it really increases inconsistencies, and makes maintenance harder.
Of course, denormalized databases may be faster. But, as a rule of thumb, you should think about performance after your system is ready for production, and after you have perceived, by the means of some automated or empirical test, that a bottleneck exists. Believe me, I have seen much worse design choices being justified by wrong performance expectations before...
SELECT t1.name,
MAX(CASE WHEN t2.id = t1.parent_id then t2.name end) as Parent,
MAX(CASE WHEN t2.id = t1.grandparent_id then t2.name end) as GrandParent
FROM your_table t1
LEFT OUTER JOIN your_table t2 ON t2.id IN (t1.parent_id, t1.grandparent_id)
WHERE t1.id = 3
group by t1.id, t1.name
Try it like this; this is for a single parent:
SELECT e.entity_name AS 'entity',
m.entity_name AS 'parent'
FROM table_name AS e
LEFT OUTER JOIN table_name AS m ON e.entity_parent =m.entity_id
Or check the link below:
http://databases.about.com/od/sql/a/selfjoins.htm