How to retrieve the properties stored in SQL with multiple inheritance - sql

I'm storing records in SQL that represent a multiple inheritance relationship similar to the one in C++, like this:
CREATE TABLE Classes
(
id INTEGER PRIMARY KEY,
name TEXT NOT NULL
);
CREATE TABLE Inheritance
(
class_id INTEGER NOT NULL,
base_class_id INTEGER NOT NULL,
FOREIGN KEY (class_id) REFERENCES Classes(id),
FOREIGN KEY (base_class_id) REFERENCES Classes(id)
);
The classes have properties of two types. These properties are inherited by the classes, but in different ways. The first type of property, whenever defined for a class, overrides the value of the same property in any of its base classes. The other type accumulates values: the property is actually a set of values, and each class inherits all values of its base classes, plus may add an additional (single) value to this set:
CREATE TABLE OverridableValues
(
class_id INTEGER PRIMARY KEY,
value TEXT NOT NULL,
FOREIGN KEY (class_id) REFERENCES Classes(id)
);
CREATE TABLE AccumulableValues
(
class_id INTEGER PRIMARY KEY,
value TEXT NOT NULL,
FOREIGN KEY (class_id) REFERENCES Classes(id)
);
The caveat with OverridableValues: there are no cases when the same property is overridden on different paths of multiple inheritance.
I'm trying to design queries using common table expressions that would return the value/values for a given property and class.
The approach that I'm trying to use is to start from the root (assume for simplicity that there is a single root class), and then to build the tree of paths from the root to every other class. The problem is how to pass the information about properties from the parents to children. For example below is an incorrect attempt to do that:
WITH ParentProperty (id, value) AS
(
SELECT c.id, a.value
FROM Classes c
LEFT JOIN AccumulableValues a
ON a.class_id = c.id
WHERE c.id = 1 --This is the root
UNION ALL
SELECT i.class_id, IFNULL(a.value, ba.value)
FROM ParentProperty p
JOIN Inheritance i
ON i.base_class_id = p.id
LEFT JOIN AccumulableValues a
ON a.class_id = i.class_id
LEFT JOIN AccumulableValues ba
ON ba.class_id = i.base_class_id
)
SELECT id, value
FROM ParentProperty;
I feel like I need one more UNION ALL inside the CTE, which is not allowed. But without it I either miss proper values or inherited ones. So far I've failed to design the query for both types of properties.
I'm using SQLite as my database engine.

Finally I've found a solution. I'm describing it below, but more efficient ones are still welcome.
Let's start with the accumulable property. My problem was that I tried to put more than one UNION ALL into a single CTE. I've solved that by adding an additional CTE (see AcquiresFrom):
WITH AcquiresFrom (class_id, from_class_id, value) AS
(
SELECT a.class_id, a.class_id, a.value
FROM AccumulableValues a
UNION ALL
SELECT i.class_id, i.base_class_id, NULL
FROM Inheritance i
),
ClassProperty (class_id, value) AS
(
SELECT c.id, NULL
FROM Classes c
LEFT JOIN Inheritance i
ON i.class_id = c.id
WHERE i.base_class_id IS NULL
UNION ALL
SELECT a.class_id, IFNULL(a.value, p.value)
FROM ClassProperty p
JOIN AcquiresFrom a
ON (a.from_class_id = p.class_id AND a.from_class_id != a.class_id) OR
(a.class_id = p.class_id AND a.class_id = a.from_class_id AND p.value IS NULL)
)
SELECT DISTINCT class_id, value
FROM ClassProperty
WHERE value IS NOT NULL
ORDER BY class_id;
AcquiresFrom describes the ways a class can acquire a value: it either introduces a new value (the first SELECT) or inherits it from a base class (the second SELECT). ClassProperty incrementally propagates the values from base classes to derived ones. The only thing left to do is to eliminate duplicates and NULL values (the final SELECT DISTINCT ... WHERE value IS NOT NULL).
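To check the accumulating query against a multiple-inheritance case, here is a minimal sketch that runs it from Python's sqlite3 module. The diamond-shaped sample data (A as the root, B and C deriving from A, D deriving from both) is made up for illustration, and the RECURSIVE keyword is spelled out explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Classes (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE Inheritance (class_id INTEGER NOT NULL, base_class_id INTEGER NOT NULL);
CREATE TABLE AccumulableValues (class_id INTEGER PRIMARY KEY, value TEXT NOT NULL);
-- Hypothetical diamond: A is the root, B and C derive from A, D derives from B and C.
INSERT INTO Classes VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
INSERT INTO Inheritance VALUES (2, 1), (3, 1), (4, 2), (4, 3);
INSERT INTO AccumulableValues VALUES (1, 'a'), (2, 'b'), (4, 'd');
""")

ACCUMULATE = """
WITH RECURSIVE AcquiresFrom (class_id, from_class_id, value) AS
(
    -- a class introduces its own value ...
    SELECT a.class_id, a.class_id, a.value
    FROM AccumulableValues a
    UNION ALL
    -- ... or inherits through an inheritance edge
    SELECT i.class_id, i.base_class_id, NULL
    FROM Inheritance i
),
ClassProperty (class_id, value) AS
(
    -- start from the classes that have no base class
    SELECT c.id, NULL
    FROM Classes c
    LEFT JOIN Inheritance i ON i.class_id = c.id
    WHERE i.base_class_id IS NULL
    UNION ALL
    SELECT a.class_id, IFNULL(a.value, p.value)
    FROM ClassProperty p
    JOIN AcquiresFrom a
      ON (a.from_class_id = p.class_id AND a.from_class_id != a.class_id) OR
         (a.class_id = p.class_id AND a.class_id = a.from_class_id AND p.value IS NULL)
)
SELECT DISTINCT class_id, value
FROM ClassProperty
WHERE value IS NOT NULL
ORDER BY class_id;
"""

# Collect the accumulated set of values per class.
accumulated = {}
for class_id, value in conn.execute(ACCUMULATE):
    accumulated.setdefault(class_id, set()).add(value)

for class_id in sorted(accumulated):
    print(class_id, sorted(accumulated[class_id]))
```

With this data, D ends up with the values of A, B and itself, while C only sees the value of A.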
The overridable property is more complex.
WITH Roots (id, value) AS
(
SELECT c.id, o.value
FROM Classes c
LEFT JOIN Inheritance i
ON i.class_id = c.id
LEFT JOIN OverridableValues o
ON o.class_id = c.id
WHERE i.base_class_id IS NULL
),
PossibleValues (id, acquired_from_id, value) AS
(
SELECT r.id, r.id, r.value
FROM Roots r
UNION ALL
SELECT i.class_id, CASE WHEN o.value IS NULL THEN p.acquired_from_id ELSE i.class_id END, IFNULL(o.value, p.value)
FROM PossibleValues p
JOIN Inheritance i
ON i.base_class_id = p.id
LEFT JOIN OverridableValues o
ON o.class_id = i.class_id
),
Split (class_id, base_class_id, direct) AS (
SELECT i.class_id, i.base_class_id, 1
FROM Inheritance i
UNION ALL
SELECT i.class_id, i.base_class_id, 0
FROM Inheritance i
),
Ancestors (id, ancestor_id) AS (
SELECT r.id, NULL
FROM Roots r
UNION ALL
SELECT s.class_id, CASE WHEN s.direct == 1 THEN a.id ELSE a.ancestor_id END
FROM Ancestors a
JOIN Split s
ON s.base_class_id = a.id
)
SELECT DISTINCT p.id, p.value
FROM PossibleValues p
WHERE p.acquired_from_id NOT IN
(
SELECT a.ancestor_id
FROM PossibleValues p1
JOIN PossibleValues p2
ON p2.id = p1.id
JOIN Ancestors a
ON a.id = p1.acquired_from_id AND a.ancestor_id = p2.acquired_from_id
WHERE p1.id = p.id
);
Roots is simply the list of classes that have no parents. The PossibleValues CTE propagates/overrides the values from the roots to the final classes, and breaks the loops created by multiple inheritance, making the structure tree-like. All valid id/value pairs are present in the output of PossibleValues; however, some invalid values are present as well. These invalid values are those that were overridden on one of the branches, while this fact is not known on another branch. The acquired_from_id column lets us reconstruct which class first introduced the value (that may be useful whenever two different classes introduce the same value).
The last thing left is to resolve the ambiguity caused by multiple inheritance. Knowing the class and two possible values we need to know whether one value overrides the other. That is resolved with the Ancestors expression.
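The override rule itself can be cross-checked outside SQL. The following is a small reference sketch in plain Python, not a translation of the query: it collects every ancestor that defines a value and discards those that are themselves ancestors of another defining class. The dictionaries bases and overrides and the sample diamond are hypothetical, and the question's caveat (no conflicting overrides on different paths) is assumed to hold.

```python
# Hypothetical sample data: class id -> list of direct base class ids.
bases = {1: [], 2: [1], 3: [1], 4: [2, 3]}
# Classes that define the overridable value themselves.
overrides = {1: "root value", 2: "value from B"}


def ancestors(cls):
    """Return all proper ancestors of cls."""
    seen = set()
    stack = list(bases[cls])
    while stack:
        base = stack.pop()
        if base not in seen:
            seen.add(base)
            stack.extend(bases[base])
    return seen


def effective_value(cls):
    """Return the value visible in cls, or None if nothing defines one."""
    candidates = {c for c in ancestors(cls) | {cls} if c in overrides}
    # A candidate is shadowed when it is a proper ancestor of another candidate.
    shadowed = set()
    for c in candidates:
        shadowed |= ancestors(c) & candidates
    winners = candidates - shadowed
    # Assumption from the question: at most one winner remains.
    return overrides[next(iter(winners))] if winners else None


for cls in sorted(bases):
    print(cls, effective_value(cls))
```

Class 4 inherits from both 2 and 3; the value defined by 1 reaches it through 3, but it is shadowed because 2 overrides it, which is the ambiguity the Ancestors expression resolves in the query.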

Related

How do I search a partitioned window function in MSSQL

Scenario:
Identify a student's first class with the university and determine if they passed a second (consecutive) class after passing the first class, within 1 year of ending the first class. If the student did not pass a consecutive second class within 1 year of ending the first class, did they pass any other classes within the same timeframe, e.g. third, fourth, fifth class.
Questions stemming from the first portion of the scenario were easy enough to answer with the use of the lead() function to pull up the next consecutive class information to the same row as the first class. However, I am having trouble finding the best way to determine if the student passed any classes within the designated timeframe, i.e. within 1 year of ending the first class.
My Question:
Is there a way to perform a lookup/search within the partition created by the lead() function?
OR
Is it better to create an additional aggregated query based on passing grades and join back to the primary table based on the aforementioned date range and appropriate key(s) using WHERE EXISTS?
Thanks for taking a look...
Indeed the question isn't very clear. So I guessed and came up with this.
CREATE TABLE Class (
Id int NOT NULL PRIMARY KEY IDENTITY(1, 1),
ExamDate date NOT NULL,
PassMark float NOT NULL
)
CREATE TABLE Thing (
Id int NOT NULL PRIMARY KEY IDENTITY(1, 1),
StudentId int NOT NULL,
ClassId int NOT NULL,
Score float NULL,
CONSTRAINT Thing_FK_Class FOREIGN KEY (ClassId) REFERENCES Class(Id)
)
;
WITH ctePasses AS (
-- Get the passes and order them.
SELECT t.*, c.ExamDate,
ROW_NUMBER() OVER (PARTITION BY StudentId ORDER BY c.ExamDate) AS n
FROM Thing t
INNER JOIN Class c ON t.ClassId = c.Id
WHERE ISNULL(t.Score, 0) >= c.PassMark
),
cteIntermediateFails AS (
-- For every student that has a pass, get the number of fails that come after it and before the next pass.
SELECT t.StudentId, p.n, COUNT(*) AS IntermediateFails
FROM Thing t
INNER JOIN Class c ON t.ClassId = c.Id
INNER JOIN ctePasses p ON t.StudentId = p.StudentId
LEFT JOIN ctePasses q ON p.StudentId = q.StudentId AND p.n + 1 = q.n
WHERE c.ExamDate > p.ExamDate AND c.ExamDate < q.ExamDate
GROUP BY t.StudentId, p.n
)
SELECT p1.StudentId, p1.ExamDate, p2.ExamDate, f.IntermediateFails
FROM ctePasses p1
LEFT JOIN ctePasses p2 ON p1.StudentId = p2.StudentId AND p2.n = p1.n + 1
LEFT JOIN cteIntermediateFails f ON p1.StudentId = f.StudentId AND p1.n = f.n
WHERE p1.n = 1
Using CTEs rather than subqueries lets me use the first one in the second one.
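For the "any pass within one year" part of the question, one possible shape is an EXISTS subquery over the same pass list. The sketch below runs on SQLite through Python rather than on SQL Server, so date(..., '+1 year') stands in for DATEADD; it reuses the guessed Class and Thing tables from above with made-up rows, and it treats the first pass as the "first class", which is an assumption. Window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Class (Id INTEGER PRIMARY KEY, ExamDate TEXT NOT NULL, PassMark REAL NOT NULL);
CREATE TABLE Thing (Id INTEGER PRIMARY KEY, StudentId INTEGER NOT NULL,
                    ClassId INTEGER NOT NULL REFERENCES Class(Id), Score REAL);
-- Hypothetical data: three classes, two students.
INSERT INTO Class VALUES (1, '2020-01-10', 50), (2, '2020-06-01', 50), (3, '2021-03-01', 50);
INSERT INTO Thing (StudentId, ClassId, Score) VALUES
    (1, 1, 60), (1, 2, 40), (1, 3, 70),   -- student 1: pass, fail, pass (too late)
    (2, 1, 80), (2, 2, 55);               -- student 2: pass, pass
""")

rows = conn.execute("""
WITH Passes AS (
    SELECT t.StudentId, c.ExamDate,
           ROW_NUMBER() OVER (PARTITION BY t.StudentId ORDER BY c.ExamDate) AS n
    FROM Thing t
    JOIN Class c ON c.Id = t.ClassId
    WHERE IFNULL(t.Score, 0) >= c.PassMark
)
SELECT p.StudentId,
       p.ExamDate AS FirstPass,
       -- 1 if any later pass falls within one year of the first pass, else 0
       EXISTS (SELECT 1
               FROM Passes q
               WHERE q.StudentId = p.StudentId
                 AND q.n > 1
                 AND q.ExamDate <= date(p.ExamDate, '+1 year')) AS PassedWithinYear
FROM Passes p
WHERE p.n = 1
ORDER BY p.StudentId
""").fetchall()

for row in rows:
    print(row)
```

Student 1 only passes again after more than a year, so the flag is 0; student 2 passes again within the window, so the flag is 1.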

Identify Duplicate Xml Nodes

I have a set of tables (with several one-many relationships) that form a single "unit". I need to ensure that we weed out duplicates, but determining duplicates requires consideration of all the data.
To make matters worse, the DB in question is still in Sql 2000 compatibility mode, so it can't use any newer features.
Create Table UnitType
(
Id int IDENTITY Primary Key,
Action int not null,
TriggerType varchar(25) not null
)
Create Table Unit
(
Id int IDENTITY Primary Key,
TypeId int Not Null,
Message varchar(100),
Constraint FK_Unit_Type Foreign Key (TypeId) References UnitType(Id)
)
Create Table Item
(
Id int IDENTITY Primary Key,
QuestionId int not null,
Sequence int not null
)
Create Table UnitCondition
(
Id int IDENTITY Primary Key,
UnitId int not null,
Value varchar(10),
ItemId int not null
Constraint FK_UnitCondition_Unit Foreign Key (UnitId) References Unit(Id),
Constraint FK_UnitCondition_Item Foreign Key (ItemId) References Item(Id)
)
Insert into Item (QuestionId, Sequence)
Values (1, 1),
(1, 2)
Insert into UnitType(Action, TriggerType)
Values (1, 'Changed')
Insert into Unit (TypeId, Message)
Values (1, 'Hello World'),
(1, 'Hello World')
Insert into UnitCondition(UnitId, Value, ItemId)
Values (1, 'Test', 1),
(1, 'Hello', 2),
(2, 'Test', 1),
(2, 'Hello', 2)
I've created a SqlFiddle demonstrating a simple form of this issue.
A Unit is considered a duplicate when all (non-Id) fields on the Unit, and all conditions on that Unit combined, exactly match another Unit in every detail. Thinking of it like XML: a Unit node (containing the Unit info and a Conditions sub-collection) is unique if no other Unit node exists that is an exact string copy:
Select
Action,
TriggerType,
U.TypeId,
U.Message,
(
Select C.Value, C.ItemId, I.QuestionId, I.Sequence
From UnitCondition C
Inner Join Item I on C.ItemId = I.Id
Where C.UnitId = U.Id
For XML RAW('Condition')
) as Conditions
from UnitType T
Inner Join Unit U on T.Id = U.TypeId
For XML RAW ('Unit'), ELEMENTS
But the issue I have is that I can't seem to get the XML for each Unit to appear as a new record, and I'm not sure how to compare the Unit Nodes to look for Duplicates.
How Can I run this query to determine if there are duplicate Xml Unit nodes within the collection?
If you want to determine whether a record is a duplicate or not, you don't need to combine all values into one string. You can do this with the ROW_NUMBER function like this:
SELECT
Action,
TriggerType,
U.Id,
U.TypeId,
U.Message,
C.Value,
I.QuestionId,
I.Sequence,
ROW_NUMBER () OVER (PARTITION BY <LIST OF FIELD THAT SHOULD BE UNIQUE>
ORDER BY <LIST OF FIELDS>) as DupeNumber
FROM UnitType T
Inner Join Unit U on T.Id = U.TypeId
Inner Join UnitCondition C on U.Id = C.UnitId
Inner Join Item I on C.ItemId = I.Id;
If DupeNumber is greater than 1, then the record is a duplicate.
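As a concrete instance of the ROW_NUMBER idea, here is a minimal sketch run on SQLite through Python (window functions need SQLite 3.25 or later). It only looks at the Unit-level fields, so it does not by itself compare the conditions; the choice of TypeId and Message as the partition and the third sample row are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Unit (Id INTEGER PRIMARY KEY, TypeId INTEGER NOT NULL, Message TEXT);
-- Units 1 and 2 match the question's sample; unit 3 is a made-up non-duplicate.
INSERT INTO Unit (TypeId, Message) VALUES
    (1, 'Hello World'), (1, 'Hello World'), (1, 'Goodbye');
""")

numbered = conn.execute("""
SELECT Id,
       ROW_NUMBER() OVER (PARTITION BY TypeId, Message  -- fields that should be unique
                          ORDER BY Id) AS DupeNumber
FROM Unit
ORDER BY Id
""").fetchall()

# Every row numbered 2 or higher repeats an earlier row in its partition.
duplicate_ids = [unit_id for unit_id, dupe_number in numbered if dupe_number > 1]
print(numbered)
print(duplicate_ids)
```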
Give this a try.
This would find the pairs that are not unique.
How to build that into your final answer, I'm not sure, but it is possibly a start:
select u1.id, u2.id
from unit as u1
join unit as u2
on u1.ID < u2.id
join UnitCondition uc1
on uc1.unitID = u1.ID
full outer join UnitCondition uc2
on uc2.unitID = u2.ID
and uc2.itemID = uc1.itemID
where uc2.itemID is null or uc1.itemID is null
So, I managed to figure out what I needed to do. It's a little clunky though.
First, you need to wrap the Xml Select statement in another select against the Unit table, in order to ensure that we end up with xml representing only that unit.
Select
Id,
(
Select
Action,
TriggerType,
IU.TypeId,
IU.Message,
(
Select C.Value, I.QuestionId, I.Sequence
From UnitCondition C
Inner Join Item I on C.ItemId = I.Id
Where C.UnitId = IU.Id
Order by C.Value, I.QuestionId, I.Sequence
For XML RAW('Condition'), TYPE
) as Conditions
from UnitType T
Inner Join Unit IU on T.Id = IU.TypeId
WHERE IU.Id = U.Id
For XML RAW ('Unit')
)
From Unit U
Then, you can wrap this in another select, grouping the xml up by content.
Select content, count(*) as cnt
From
(
Select
Id,
(
Select
Action,
TriggerType,
IU.TypeId,
IU.Message,
(
Select C.Value, C.ItemId, I.QuestionId, I.Sequence
From UnitCondition C
Inner Join Item I on C.ItemId = I.Id
Where C.UnitId = IU.Id
Order by C.Value, I.QuestionId, I.Sequence
For XML RAW('Condition'), TYPE
) as Conditions
from UnitType T
Inner Join Unit IU on T.Id = IU.TypeId
WHERE IU.Id = U.Id
For XML RAW ('Unit')
) as content
From Unit U
) as data
group by content
having count(*) > 1
This will allow you to group entire units where the whole content is identical.
One thing to watch out for though, is that to test "uniqueness", you need to guarantee that the data on the inner Xml selection(s) is always the same. To that end, you should apply ordering on the relevant data (i.e. the data in the xml) to ensure consistency. What order you apply doesn't really matter, so long as two identical collections will output in the same order.
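The ordering point can be illustrated outside the database. This is a sketch in plain Python of the same idea: build one canonical key per unit with its conditions sorted, then group on that key. The in-memory data mirrors the sample rows from the question, except that unit 2's conditions are listed in a different order and a hypothetical unit 3 with only one condition is added.

```python
from collections import defaultdict

# Unit id -> (TypeId, Message)
units = {
    1: (1, "Hello World"),
    2: (1, "Hello World"),
    3: (1, "Hello World"),
}
# (UnitId, Value, ItemId); unit 2 has the same conditions as unit 1, in another order.
conditions = [
    (1, "Test", 1), (1, "Hello", 2),
    (2, "Hello", 2), (2, "Test", 1),
    (3, "Test", 1),
]


def canonical_key(unit_id):
    """Unit fields plus the unit's conditions in a fixed (sorted) order."""
    unit_conditions = sorted(
        (value, item_id) for (uid, value, item_id) in conditions if uid == unit_id
    )
    return (units[unit_id], tuple(unit_conditions))


groups = defaultdict(list)
for unit_id in sorted(units):
    groups[canonical_key(unit_id)].append(unit_id)

duplicates = [ids for ids in groups.values() if len(ids) > 1]
print(duplicates)
```

Without the sorted() call, units 1 and 2 would produce different keys even though their content is identical, which is the same failure mode as an unordered inner XML selection.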

SQL Server: querying hierarchical and referenced data

I'm working on an asset database that has a hierarchy. Also, there is a "ReferenceAsset" table, that effectively points back to an asset. The Reference Asset basically functions as an override, but it is selected as if it were a unique, new asset. One of the overrides that gets set, is the parent_id.
Columns that are relevant to selecting the hierarchy:
Asset: id (primary), parent_id
Asset Reference: id (primary), asset_id (foreignkey->Asset), parent_id (always an Asset)
---EDITED 5/27----
Sample Relevant Table Data (after joins):
id | asset_id | name        | parent_id | milestone | type
 3 |        3 | suit        | null      | march     | shape
 4 |        4 | suit_banker | 3         | april     | texture
 5 |        5 | tie         | null      | march     | shape
 6 |        6 | tie_red     | 5         | march     | texture
 7 |        7 | tie_diamond | 5         | june      | texture
-5 |        6 | tie_red     | 4         | march     | texture
Rows with id < 0 (like the last row) signify assets that are referenced. Referenced assets have a few columns that are overridden (in this case, only parent_id is important).
The expectation is that if I select all assets from april, I should do a secondary select to get the entire tree branches of the matching query:
so initially the query match would result in:
4 4 suit_banker 3 april texture
Then after the CTE, we get the complete hierarchy and our result should be this (so far this is working)
3 3 suit null march shape
4 4 suit_banker 3 april texture
-5 6 tie_red 4 march texture
As you can see, the parent of id -5 is there, but what is still missing is the referenced asset and the parent of the referenced asset:
5 5 tie null march shape
6 6 tie_red 5 march texture
Currently my solution works for this, but it is limited to only a single depth of references (and I feel the implementation is quite ugly).
---Edited----
Here is my primary Selection Function. This should better demonstrate where the real complication lies: the AssetReference.
Select A.id as id, A.id as asset_id, A.name,A.parent_id as parent_id, A.subPath, T.name as typeName, A2.name as parent_name, B.name as batchName,
L.name as locationName,AO.owner_name as ownerName, T.id as typeID,
M.name as milestoneName, A.deleted as bDeleted, 0 as reference, W.phase_name, W.status_name
FROM Asset as A Inner Join Type as T on A.type_id = T.id
Inner Join Batch as B on A.batch_id = B.id
Left Join Location L on A.location_id = L.id
Left Join Asset A2 on A.parent_id = A2.id
Left Join AssetOwner AO on A.owner_id = AO.owner_id
Left Join Milestone M on A.milestone_id = M.milestone_id
Left Join Workflow as W on W.asset_id = A.id
where A.deleted <= @showDeleted
UNION
Select -1*AR.id as id, AR.asset_id as asset_id, A.name, AR.parent_id as parent_id, A.subPath, T.name as typeName, A2.name as parent_name, B.name as batchName,
L.name as locationName,AO.owner_name as ownerName, T.id as typeID,
M.name as milestoneName, A.deleted as bDeleted, 1 as reference, NULL as phase_name, NULL as status_name
FROM Asset as A Inner Join Type as T on A.type_id = T.id
Inner Join Batch as B on A.batch_id = B.id
Left Join Location L on A.location_id = L.id
Left Join Asset A2 on AR.parent_id = A2.id
Left Join AssetOwner AO on A.owner_id = AO.owner_id
Left Join Milestone M on A.milestone_id = M.milestone_id
Inner Join AssetReference AR on AR.asset_id = A.id
where A.deleted <= @showDeleted
I have a stored procedure that takes a temp table (#temp) and finds all the elements of the hierarchy. The strategy I employed was this:
Select the entire system hierarchy into a temp table (#treeIDs) represented by a comma separated list of each entire tree branch
Get entire hierarchy of assets matching query (from #temp)
Get all reference assets pointed to by Assets from hierarchy
Parse the hierarchy of all reference assets
This works for now because reference assets are always the last item on a branch, but if they weren't, I think I would be in trouble. I feel like I need some better form of recursion.
Here is my current code, which is working, but I am not proud of it, and I know it is not robust (because it only works if the references are at the bottom):
Step 1. build the entire hierarchy
;WITH Recursive_CTE AS (
SELECT Cast(id as varchar(100)) as Hierarchy, parent_id, id
FROM #assetIDs
Where parent_id is Null
UNION ALL
SELECT
CAST(parent.Hierarchy + ',' + CAST(t.id as varchar(100)) as varchar(100)) as Hierarchy, t.parent_id, t.id
FROM Recursive_CTE parent
INNER JOIN #assetIDs t ON t.parent_id = parent.id
)
Select Distinct h.id, Hierarchy as idList into #treeIDs
FROM ( Select Hierarchy, id FROM Recursive_CTE ) parent
CROSS APPLY dbo.SplitIDs(Hierarchy) as h
Step 2. Select the branches of all assets that match the query
Select DISTINCT L.id into #RelativeIDs FROM #treeIDs
CROSS APPLY dbo.SplitIDs(idList) as L
WHERE #treeIDs.id in (Select id FROM #temp)
Step 3. Get all Reference Assets in the branches
(Reference assets have negative id values, hence the id < 0 part)
Select asset_id INTO #REFLinks FROM #AllAssets WHERE id in
(Select #AllAssets.asset_id FROM #AllAssets Inner Join #RelativeIDs
on #AllAssets.id = #RelativeIDs.id Where #RelativeIDs.id < 0)
Step 4. Get the branches of anything found in step 3
Select DISTINCT L.id into #extraRelativeIDs FROM #treeIDs
CROSS APPLY dbo.SplitIDs(idList) as L
WHERE
exists (Select #REFLinks.asset_id FROM #REFLinks WHERE #REFLinks.asset_id = #treeIDs.id)
and Not Exists (select id FROM #RelativeIDs Where id = #treeIDs.id)
I've tried to just show the relevant code. I am super grateful to anyone who can help me find a better solution!
-- Get all of the children of a root node (there could be more than one root,
-- which would require revising the query a bit).
DECLARE @AssetID int = (SELECT id FROM Asset WHERE parent_id IS NULL);
-- The algorithm is relational recursion.
-- The anchor gets the top level of the hierarchy we want. The Asset_Hierarchy column
-- shows the row's place in the hierarchy from this query only,
-- not the row's overall place in the table.
WITH Hierarchy (AssetID, ParentID, LevelCode, Asset_Hierarchy)
AS
(
SELECT id, parent_id,
1 AS LevelCode, CAST(id AS varchar(max)) AS Asset_Hierarchy
FROM Asset
WHERE id = @AssetID
UNION ALL
-- joins back to the CTE to recursively retrieve the rows;
-- note that LevelCode is incremented on each iteration
SELECT a.id, a.parent_id,
h.LevelCode + 1 AS LevelCode,
h.Asset_Hierarchy + '\' + CAST(a.id AS varchar(20)) AS Asset_Hierarchy
FROM Asset AS a
INNER JOIN Hierarchy AS h
-- use this to get children, since the parent_id of the child is set to the id
-- of the current row
ON a.parent_id = h.AssetID
-- use this instead to get parents, since the parent of the Hierarchy row is the asset,
-- not the parent:
-- ON a.id = h.ParentID
)
-- return results from the CTE, joining to the Asset data to get the asset name
SELECT a.id, a.name,
h.LevelCode, h.Asset_Hierarchy
FROM Asset AS a
INNER JOIN Hierarchy AS h
ON a.id = h.AssetID
ORDER BY h.Asset_Hierarchy;
-- That is the structure you will want. I would need a little more clarification of your table structure.
It would help to know your underlying table structure. There are two approaches which should work depending on your environment: SQL Server understands XML, so you could store the hierarchy as an XML structure, or you could simply have a single table in which each row has a unique primary key id and a parentid, where parentid is a foreign key referencing id. The data for the node are just standard columns. You can use a CTE, or a function powering a calculated column, to determine the degree of nesting for each node. The limitation is that a node can only have one parent.
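A minimal sketch of the single-table approach, computing the degree of nesting with a recursive CTE. It runs on SQLite through Python rather than on SQL Server, and the Asset(id, parent_id, name) layout is an assumption based on the columns listed in the question; the rows are the non-reference assets from the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Asset (
    id INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES Asset(id),  -- NULL for a root
    name TEXT NOT NULL
);
INSERT INTO Asset VALUES
    (3, NULL, 'suit'), (4, 3, 'suit_banker'),
    (5, NULL, 'tie'), (6, 5, 'tie_red'), (7, 5, 'tie_diamond');
""")

depths = conn.execute("""
WITH RECURSIVE Tree (id, depth) AS (
    -- anchor: roots have no parent and sit at depth 1
    SELECT id, 1 FROM Asset WHERE parent_id IS NULL
    UNION ALL
    -- recursive step: a child is one level below its parent
    SELECT a.id, t.depth + 1
    FROM Asset a
    JOIN Tree t ON a.parent_id = t.id
)
SELECT id, depth FROM Tree ORDER BY id
""").fetchall()

print(depths)
```

Because each row carries a single parent_id, this layout cannot express the reference assets from the question, which is the one-parent limitation mentioned above.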

Multiple left joins - keeping the returned row count down?

I'm trying to figure out how best to query against a schema that consists of one central table, plus a number of "attribute" tables (sorry, not sure of the best terminology here) that record one-to-many relationships. In the business layer, each of these tables corresponds to a collection that may contain zero or more elements.
Right now the code I'm looking at retrieves the data by getting a list of values from the master table, then looping over it and querying each of the "accessory" tables to populate these collections.
I'd like to try and get it down to a single query if I can. I tried using multiple LEFT JOINs. But this effectively joins against a cross product of the values in the accessory tables, which leads to an explosion of rows - especially when you add a few more joins. The table in question includes five such relationships, so the number of rows returned for each record is potentially enormous, and almost entirely composed of redundant data.
Here's a smaller synthetic example of some tables, data, the query structure I'm using, and results:
Database structure & data:
create table Containers (
Id int not null primary key,
Name nvarchar(8) not null);
create table Containers_Animals (
Container int not null references Containers(Id),
Animal nvarchar(8) not null,
primary key (Container, Animal)
);
create table Containers_Foods (
Container int not null references Containers(Id),
Food nvarchar(8) not null,
primary key (Container, Food)
);
insert into Containers (Id, Name)
values (0, 'box'), (1, 'sack'), (2, 'bucket');
insert into Containers_Animals (Container, Animal)
values (1, 'monkey'), (2, 'dog'), (2, 'whale'), (2, 'lemur');
insert into Containers_Foods (Container, Food)
values (1, 'lime'), (2, 'bread'), (2, 'chips'), (2, 'apple'), (2, 'grape');
Coupled to a business object like this:
class Container {
public string Name;
public string[] Animals; // may be empty
public string[] Foods; // may be empty
}
And here's the way I'm constructing the query against it:
select c.Name container, a.Animal animal, f.Food food from Containers c
left join Containers_Animals a on a.Container = c.Id
left join Containers_Foods f on f.Container = c.Id;
Which gives these results:
container animal food
--------- -------- --------
box NULL NULL
sack monkey lime
bucket dog apple
bucket dog bread
bucket dog chips
bucket dog grape
bucket lemur apple
bucket lemur bread
bucket lemur chips
bucket lemur grape
bucket whale apple
bucket whale bread
bucket whale chips
bucket whale grape
What I'd like to see instead is a number of rows equal to the maximum number of values associated with the root table on any of the relationships, with empty space filled in with NULLs. That would keep the number of rows returned way, way, way down, while still being easy to transform into objects. Something like this:
container animal food
--------- -------- --------
box NULL NULL
sack monkey lime
bucket dog apple
bucket lemur bread
bucket whale chips
bucket NULL grape
Can it be done?
Why not just return two data sets ordered by container, and then do a logical merge join on them in the client? What you're asking for is going to make the DB engine do a lot more work, with a lot more complicated query, for (to me) small benefit.
It would look something like this. Use two left joins to make sure each data set has at least one instance of all container names, then loop through them simultaneously. Here is some rough pseudocode:
Dim CurrentContainer
If Not Animals.Eof Then
CurrentContainer = Animals.Container
End If
Do While Not Animals.Eof Or Not Foods.Eof
Row = New Couplet(AnimalType, FoodType);
If Animals.Container = CurrentContainer Then
Row.AnimalType = Animals.Animal
Animals.MoveNext
End If
If Foods.Container = CurrentContainer Then
Row.FoodType = Foods.Food
Foods.MoveNext
End If
If Not Animals.Eof AndAlso Animals.Container <> CurrentContainer _
AndAlso Not Foods.Eof AndAlso Foods.Container <> CurrentContainer Then
CurrentContainer = [Container from either non-Eof recordset]
End If
'Process the row, output it, put it in a stack, build a new recordset, whatever.
Loop
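The client-side idea can also be sketched in Python. This version is not the cursor-style loop above: it groups each result set by container and then pairs the items by position, which produces the same rows for this data. The two input lists stand for the two ordered result sets, with a (container, None) row where a left join found nothing; the function names are made up.

```python
from itertools import zip_longest


def group_by_container(rows):
    """Map each container to its list of items, preserving first-seen order."""
    grouped = {}
    for container, item in rows:
        items = grouped.setdefault(container, [])
        if item is not None:  # None marks "no match" from the left join
            items.append(item)
    return grouped


def merge_by_position(animal_rows, food_rows):
    """Pair the n-th animal with the n-th food of each container."""
    animals = group_by_container(animal_rows)
    foods = group_by_container(food_rows)
    merged = []
    for container in animals:  # the left joins put every container in both sets
        pairs = list(zip_longest(animals[container], foods.get(container, [])))
        if not pairs:
            pairs = [(None, None)]  # keep containers that have neither
        merged.extend((container, animal, food) for animal, food in pairs)
    return merged


animal_rows = [("box", None), ("sack", "monkey"),
               ("bucket", "dog"), ("bucket", "lemur"), ("bucket", "whale")]
food_rows = [("box", None), ("sack", "lime"),
             ("bucket", "apple"), ("bucket", "bread"),
             ("bucket", "chips"), ("bucket", "grape")]

merged = merge_by_position(animal_rows, food_rows)
for row in merged:
    print(row)
```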
However, of course what you're asking for is possible! Here are two ways.
Treat the inputs separately and join on their position:
WITH CA AS (
SELECT *,
Row_Number() OVER (PARTITION BY Container ORDER BY Animal) Pos
FROM Containers_Animals
), CF AS (
SELECT *,
Row_Number() OVER (PARTITION BY Container ORDER BY Food) Pos
FROM Containers_Foods
)
SELECT
C.Name,
CA.Animal,
CF.Food
FROM
Containers C
LEFT JOIN (
SELECT Container, Pos FROM CA
UNION SELECT Container, Pos FROM CF
) P ON C.Id = P.Container
LEFT JOIN CA
ON C.Id = CA.Container
AND P.Pos = CA.Pos
LEFT JOIN CF
ON C.Id = CF.Container
AND P.Pos = CF.Pos;
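To see the join-on-position query produce the requested shape, here is a sketch that runs essentially the same statement on SQLite through Python, using the sample data from the question. It needs SQLite 3.25 or later for Row_Number; the derived table is written as a third CTE named P, and the final ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Containers (Id INT NOT NULL PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Containers_Animals (Container INT NOT NULL, Animal TEXT NOT NULL,
                                 PRIMARY KEY (Container, Animal));
CREATE TABLE Containers_Foods (Container INT NOT NULL, Food TEXT NOT NULL,
                               PRIMARY KEY (Container, Food));
INSERT INTO Containers VALUES (0, 'box'), (1, 'sack'), (2, 'bucket');
INSERT INTO Containers_Animals VALUES (1, 'monkey'), (2, 'dog'), (2, 'whale'), (2, 'lemur');
INSERT INTO Containers_Foods VALUES
    (1, 'lime'), (2, 'bread'), (2, 'chips'), (2, 'apple'), (2, 'grape');
""")

result = conn.execute("""
WITH CA AS (
    SELECT Container, Animal,
           ROW_NUMBER() OVER (PARTITION BY Container ORDER BY Animal) AS Pos
    FROM Containers_Animals
), CF AS (
    SELECT Container, Food,
           ROW_NUMBER() OVER (PARTITION BY Container ORDER BY Food) AS Pos
    FROM Containers_Foods
), P AS (
    -- every (container, position) that occurs in either list
    SELECT Container, Pos FROM CA
    UNION
    SELECT Container, Pos FROM CF
)
SELECT C.Name, CA.Animal, CF.Food
FROM Containers C
LEFT JOIN P  ON C.Id = P.Container
LEFT JOIN CA ON C.Id = CA.Container AND P.Pos = CA.Pos
LEFT JOIN CF ON C.Id = CF.Container AND P.Pos = CF.Pos
ORDER BY C.Id, P.Pos
""").fetchall()

for row in result:
    print(row)
```

The bucket yields four rows, one per food, instead of the twelve rows of the plain double left join.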
Concatenate the inputs vertically and pivot them:
WITH FoodAnimals AS (
SELECT
C.Name,
1 Which,
CA.Animal Item,
Row_Number() OVER (PARTITION BY C.Id ORDER BY (CA.Animal)) Pos
FROM
Containers C
LEFT JOIN Containers_Animals CA
ON C.Id = CA.Container
UNION
SELECT
C.Name,
2 Which,
CF.Food,
Row_Number() OVER (PARTITION BY C.Id ORDER BY (CF.Food)) Pos
FROM
Containers C
LEFT JOIN Containers_Foods CF
ON C.Id = CF.Container
)
SELECT
P.Name,
P.[1] Animal,
P.[2] Food
FROM
FoodAnimals FA
PIVOT (Max(Item) FOR Which IN ([1], [2])) P;
; with a as (
select ID, c.Name container, a.Animal animal
, r=row_number()over(partition by c.ID order by a.Animal)
from Containers c
left join Containers_Animals a on a.Container = c.Id
)
, b as (
select ID, c.Name container, f.Food food
, r=row_number()over(partition by c.ID order by f.Food)
from Containers c
left join Containers_Foods f on f.Container = c.Id
)
select a.container, a.animal, b.food
from a
left join b on a.container=b.container and a.r=b.r
union
select b.container, a.animal, b.food
from b
left join a on a.container=b.container and a.r=b.r
WITH
ca_ranked AS (
SELECT
*,
rnk = ROW_NUMBER() OVER (PARTITION BY Container ORDER BY Animal)
FROM Containers_Animals
),
cf_ranked AS (
SELECT
*,
rnk = ROW_NUMBER() OVER (PARTITION BY Container ORDER BY Food)
FROM Containers_Foods
)
SELECT
container = c.Name,
animal = ca.Animal,
food = cf.Food
FROM ca_ranked ca
FULL JOIN cf_ranked cf ON ca.Container = cf.Container AND ca.rnk = cf.rnk
RIGHT JOIN Containers c ON c.Id = COALESCE(ca.Container, cf.Container)
;

SQL Server EXISTS query to determine relationship

I have the following tables:
Foo
FooId INT PRIMARY KEY
FooRelationship
FooRelationshipId INT PRIMARY KEY IDENTITY
FooParentId INT FK
FooChildId INT FK
How would I write a query that would return every id from Foo and the status of the record (whether it is a parent, a child or neither)?
Rules:
A foo will only be a parent or a child or neither.
A foo can be the parent of multiple different foos.
A foo can not be the child of more than one foo.
I originally wrote this query:
SELECT
FooId,
CASE
WHEN Parent.FooRelationshipId IS NOT NULL THEN 'Parent'
WHEN Child.FooRelationshipId IS NOT NULL THEN 'Child'
ELSE 'Neither'
END
FROM Foo F
LEFT JOIN FooRelationship Parent ON F.FooId = Parent.FooParentId
LEFT JOIN FooRelationship Child ON F.FooId = Child.FooChildId
This is broken because if a Foo is a parent to two other Foos then it returns that id twice.
How can I rewrite this to either not use a join or use an EXISTS or something.
Just use DISTINCT - this is a good use case for it. A plain EXISTS filter in the WHERE clause would not be enough on its own, since you need a status for every row of Foo rather than a subset of rows:
SELECT DISTINCT
FooId,
CASE
WHEN Parent.FooRelationshipId IS NOT NULL THEN 'Parent'
WHEN Child.FooRelationshipId IS NOT NULL THEN 'Child'
ELSE 'Neither'
END
FROM Foo F
LEFT JOIN FooRelationship Parent ON F.FooId = Parent.FooParentId
LEFT JOIN FooRelationship Child ON F.FooId = Child.FooChildId
I'm not normally a big fan of DISTINCT because it's often used to hide messy data, but I think this is an appropriate use for it.
Be warned it may slow things down dramatically if you are using it across a large number of fields and rows.
If you want to get just these values and then populate the rest of the rows as well, you can do a subquery for the relationship logic:
SELECT s.FooID, s.Relationship, T.*
FROM Table T
INNER JOIN (SELECT DISTINCT
FooId,
CASE
WHEN Parent.FooRelationshipId IS NOT NULL THEN 'Parent'
WHEN Child.FooRelationshipId IS NOT NULL THEN 'Child'
ELSE 'Neither'
END as [Relationship]
FROM Foo F
LEFT JOIN FooRelationship Parent ON F.FooId = Parent.FooParentId
LEFT JOIN FooRelationship Child ON F.FooId = Child.FooChildId) s
ON s.FooId = t.FooID
Try to use SELECT DISTINCT FooID, ...
it will return just one FooID in case that you mentioned as problematic
Just modify your existing query to do a select distinct instead of a plain select.
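Since the question also asks about EXISTS, another option worth considering is to put EXISTS subqueries inside the CASE expression, so that no join can multiply rows and DISTINCT is not needed. A minimal sketch, run on SQLite through Python with made-up rows (Foo 1 is the parent of 2 and 3, Foo 4 is unrelated):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Foo (FooId INTEGER PRIMARY KEY);
CREATE TABLE FooRelationship (
    FooRelationshipId INTEGER PRIMARY KEY,
    FooParentId INTEGER NOT NULL REFERENCES Foo(FooId),
    FooChildId INTEGER NOT NULL REFERENCES Foo(FooId)
);
-- Hypothetical data: 1 is the parent of 2 and 3; 4 is neither.
INSERT INTO Foo VALUES (1), (2), (3), (4);
INSERT INTO FooRelationship (FooParentId, FooChildId) VALUES (1, 2), (1, 3);
""")

statuses = conn.execute("""
SELECT F.FooId,
       CASE
           WHEN EXISTS (SELECT 1 FROM FooRelationship R
                        WHERE R.FooParentId = F.FooId) THEN 'Parent'
           WHEN EXISTS (SELECT 1 FROM FooRelationship R
                        WHERE R.FooChildId = F.FooId) THEN 'Child'
           ELSE 'Neither'
       END AS Relationship
FROM Foo F
ORDER BY F.FooId
""").fetchall()

print(statuses)
```

Each Foo appears exactly once regardless of how many children it has, because the subqueries only test for existence.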