Common Table Expression to traverse down hierarchy - sql

The Structure
I have 2 tables that link to each other. IntermediateTable holds a set of values and a nullable foreign key that points to the Id of HierarchicalTable, which in turn contains 2 foreign keys (LeftId and RightId) back to IntermediateTable.
HierarchicalTable
Id  LeftId  RightId  SomeValue
1   1       2        some value
2   3       4        top level in tree
3   5       6        incorrect hierarchy 1
4   7       8        incorrect result top level
IntermediateTable
Id  SomeValue           HierarchicalTableId
1   some value          NULL
2   value               NULL
3   NULL                1
4   value               NULL
5   incorrect result 1  NULL
6   incorrect result 3  NULL
7   incorrect result 3  NULL
8   NULL                3
Each table points down the hierarchy. Here is this structure graphed out for the Hierarchical Table records 1 & 2 and their IntermediateTable values:
(H : HierarchicalTable, I : IntermediateTable)
       H-2
      /   \
    I-3   I-4
    /
   H-1
  /   \
I-1   I-2
The Problem
I need to be able to send in an Id for a given HierarchicalTable and get all the HierarchicalTable records below it. So, for the structure above, if I pass 1 into a query, I should just get H-1 (and from that, I can load the related IntermediateTable values). If I pass 2, I should get H-2 and H-1 (and, again, use those to load the relevant IntermediateTable values).
The Attempts
I've tried using a CTE, but there are a few main things that are different from the examples I've seen:
In my structure, the objects point down to their children, instead of up to their parent
I have the Id of the top object, not the Id of the bottom object.
My hierarchy is split across 2 tables. This shouldn't be a big issue once I understand the algorithm to find the results I need, but this could be causing additional confusion for me.
If I run this query:
declare @TargetId bigint = 2
;
with test as (
select h.*
from dbo.hierarchicaltable h
inner join dbo.intermediatetable i
on (h.leftid = i.id or h.rightid = i.id)
union all
select h.*
from dbo.hierarchicaltable h
where h.id = @TargetId
)
select distinct *
from test
I get all 4 records in the HierarchicalTable, instead of just records 1 & 2. I'm not sure if what I want is possible to do with a CTE.

Try this:
This builds the entire tree across both tables, then filters the result down to the HierarchicalTable records.
-- Sample data from the question
DECLARE @HierarchicalTable TABLE(
    Id INT,
    LeftId INT,
    RightId INT,
    SomeValue VARCHAR(MAX)
)
INSERT INTO @HierarchicalTable
VALUES
    (1, 1, 2, 'some value '),
    (2, 3, 4, 'top level in tree '),
    (3, 5, 6, 'incorrect hierarchy 1 '),
    (4, 7, 8, 'incorrect result top level')
DECLARE @IntermediateTable TABLE(
    Id INT,
    SomeValue VARCHAR(MAX),
    HierarchicalTableId INT
)
INSERT INTO @IntermediateTable
VALUES
    (1, 'some value' ,NULL ),
    (2, 'value ' ,NULL ),
    (3, NULL ,1 ),
    (4, 'value ' ,NULL ),
    (5, 'incorrect result 1' ,NULL ),
    (6, 'incorrect result 3' ,NULL ),
    (7, 'incorrect result 3' ,NULL ),
    (8, NULL ,3 )
DECLARE @TargetId INT = 2;
WITH CTE AS (
    -- Anchor: the requested HierarchicalTable row
    SELECT Id AS ResultId, LeftId, RightId, NULL AS HierarchicalTableId
    FROM @HierarchicalTable
    WHERE Id = @TargetId
    UNION ALL
    -- From an IntermediateTable row, step down to the HierarchicalTable row it points to
    SELECT C.Id AS ResultId, C.LeftId, C.RightId, NULL AS HierarchicalTableId
    FROM @HierarchicalTable C
    INNER JOIN CTE P ON P.HierarchicalTableId = C.Id
    UNION ALL
    -- From a HierarchicalTable row, step down to its left and right IntermediateTable rows
    SELECT NULL AS ResultId, NULL AS LeftId, NULL AS RightId, C.HierarchicalTableId
    FROM @IntermediateTable C
    INNER JOIN CTE P ON P.LeftId = C.Id OR P.RightId = C.Id
)
-- Keep only the HierarchicalTable rows
SELECT *
FROM CTE
WHERE ResultId IS NOT NULL
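The same downward walk can also be written with a single recursive member that hops through IntermediateTable in one step. The following is only a sketch of that variation, not the answer's query: it runs against SQLite through Python's `sqlite3` module rather than SQL Server, the function name `hierarchy_below` is made up for illustration, and the SomeValue columns are left out because the traversal does not need them.

```python
import sqlite3

def hierarchy_below(target_id):
    """Return the HierarchicalTable ids at or below target_id (sketch)."""
    con = sqlite3.connect(":memory:")
    # Sample data from the question, without the SomeValue columns.
    con.executescript("""
        CREATE TABLE HierarchicalTable (Id INT, LeftId INT, RightId INT);
        CREATE TABLE IntermediateTable (Id INT, HierarchicalTableId INT);
        INSERT INTO HierarchicalTable VALUES (1,1,2),(2,3,4),(3,5,6),(4,7,8);
        INSERT INTO IntermediateTable VALUES
            (1,NULL),(2,NULL),(3,1),(4,NULL),(5,NULL),(6,NULL),(7,NULL),(8,3);
    """)
    rows = con.execute("""
        WITH RECURSIVE tree(Id, LeftId, RightId) AS (
            -- Anchor: the requested HierarchicalTable row
            SELECT Id, LeftId, RightId
            FROM HierarchicalTable
            WHERE Id = ?
            UNION ALL
            -- Recursive step: parent -> its two IntermediateTable rows
            -- -> the HierarchicalTable row each of them points to (if any)
            SELECT c.Id, c.LeftId, c.RightId
            FROM tree AS p
            JOIN IntermediateTable AS i ON i.Id IN (p.LeftId, p.RightId)
            JOIN HierarchicalTable AS c ON c.Id = i.HierarchicalTableId
        )
        SELECT Id FROM tree ORDER BY Id
    """, (target_id,)).fetchall()
    con.close()
    return [row[0] for row in rows]

print(hierarchy_below(2))
```

Rows whose HierarchicalTableId is NULL simply fail the last join, so the recursion stops at the leaves without an explicit filter.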

Related

TSQL - How to avoid UNION ALL

Sample Data:
DECLARE @Parent TABLE
(
    [Id] INT
  , [Misc_Val] VARCHAR(5)
) ;
DECLARE @Children TABLE
(
    [Id] INT
  , [P_ID] INT
) ;
INSERT INTO @Parent
VALUES
    ( 1, 'One' )
  , ( 2, 'Two' )
  , ( 3, 'Three' )
  , ( 5, 'Four' ) ;
INSERT INTO @Children
VALUES
    ( 10, 1 )
  , ( 11, 1 )
  , ( 21, 2 )
  , ( 23, 2 )
  , ( 30, 3 )
  , ( 40, 4 ) ;
Goal:
To efficiently output three fields ([Id], [IsChild], and [Misc_Val]). Output all records from the @Parent table with [IsChild] = 0, and all MATCHING records from the @Children table (@Parent.Id = @Children.P_ID) with [IsChild] = 1.
Expected Output
Id IsChild Misc_Val
1 0 One
2 0 Two
3 0 Three
5 0 Four
10 1 One
11 1 One
21 1 Two
23 1 Two
30 1 Three
My try:
SELECT [P].[Id]
     , 0 AS [IsChild]
     , [P].[Misc_Val]
FROM @Parent AS [P]
UNION ALL
SELECT [C].[Id]
     , 1
     , [P].[Misc_Val]
FROM @Parent AS [P]
JOIN @Children AS [C]
    ON [C].[P_ID] = [P].[Id] ;
Is there a better way to do this than using UNION ALL? The @Parent and @Children tables are quite big, so I am trying to avoid querying the @Parent table twice.
UPDATE: The answer below made me realize something I missed when creating the post with mocked data: we do need some additional data from the @Parent table in the final output regardless.
You can use CROSS APPLY to add the child table to the parent table.
This may or may not be faster, it can depend on indexing and so forth. You need to check the query plan.
SELECT v.Id
     , v.IsChild
     , P.Misc_Val
FROM @Parent AS P
CROSS APPLY (
    SELECT
        P.Id,
        0 AS IsChild
    UNION ALL
    SELECT
        C.Id,
        1
    FROM @Children AS C
    WHERE C.P_ID = P.Id
) v;
Note that the first SELECT in the apply has no FROM and therefore does not do any table access.

SQL Server query to extract all rows

I've two database tables, one called "Headers" and one called "Rows".
The structure is:
Header: IDPK | Description
Row: IDPK | IDPK_Header | Item_ID | Qty
I need a query that says: "Given a Header IDPK, find the other headers that have the same number of rows with the same item IDs and quantities".
For example:
Header                Rows
IDPK  Description     IDPK  Item_ID  Qty
1     'Test1'         1     'A'      10
1     'Test1'         2     'B'      20
2     'Test2'         3     'A'      10
2     'Test2'         4     'B'      20
3     'Test3'         5     'A'      5
3     'Test3'         6     'B'      20
4     'Test4'         7     'A'      10
Header Test1 matches Test2, but not Test3 or Test4.
The problem is that the number of rows must be exactly the same. I tried the ALL operator, but without luck.
How can I write the query with an eye on performance? The two tables can be very large (~500,000 records).
Assuming there are no duplicates:
with r as (
select r.*, count(*) over (partition by idpk_header) as num_items
from rows r
)
select r1.idpk_header, r2.idpk_header
from r r1 join
r r2
on r2.item_id = r1.item_id and r2.qty = r1.qty and r2.num_items = r1.num_items
group by r1.idpk_header, r2.idpk_header, r1.num_items
having count(*) = r1.num_items;
Basically, this does a self-join on the items, so you only get matches. The on validates that the two have the same number of items. And the having guarantees that all match.
Note: This version returns each match of the header to itself. That is a nice check. You can of course filter this out in the on or a where clause.
If you do have duplicate items, you can simply replace r with:
select idpk_header, item_id, sum(qty) as qty,
count(*) over (partition by idpk_header) as num_items
from rows r
group by idpk_header, item_id;
I would suggest using a FOR XML query to build the list of items per IDPK, and then searching for matching item lists and quantities. See the following example:
DECLARE @Headers TABLE(
    IDPK INT,
    Description NVARCHAR(100)
)
DECLARE @Rows TABLE(
    IDPK INT,
    ITEMID NVARCHAR(1),
    Qty INT
)
INSERT INTO @Headers VALUES
    (1, 'Test1'),
    (2, 'Test2'),
    (3, 'Test3'),
    (4, 'Test4'),
    (5, 'Test5')
INSERT INTO @Rows VALUES
    (1, 'A', 10),
    (1, 'B', 20),
    (2, 'A', 10),
    (2, 'B', 20),
    (3, 'A', 5 ),
    (3, 'B', 20),
    (4, 'C', 10),
    (5, 'A', 10),
    (5, 'C', 20)
;
WITH cteHeaderRows AS(
    SELECT IDPK
        ,ItemIDs=STUFF(
            (
                SELECT ',' + CAST(ITEMID AS VARCHAR(MAX))
                FROM @Rows t2
                WHERE t2.IDPK = t1.IDPK
                ORDER BY ITEMID, QTY
                FOR XML PATH('')
            ),1,1,''
        )
        ,Qtys=STUFF(
            (
                SELECT ',' + CAST(Qty AS VARCHAR(MAX))
                FROM @Rows t2
                WHERE t2.IDPK = t1.IDPK
                ORDER BY ITEMID, QTY
                FOR XML PATH('')
            ),1,1,''
        )
    FROM @Rows t1
    GROUP BY IDPK
),
cteFilter AS(
    SELECT h1.IDPK AS IDPK1, h2.IDPK AS IDPK2
    FROM cteHeaderRows h1
    JOIN cteHeaderRows h2 ON h1.IDPK != h2.IDPK AND h1.ItemIDs = h2.ItemIDs AND h2.Qtys = h1.Qtys
)
SELECT DISTINCT h.IDPK, h.Description, r.ItemID, r.Qty
FROM @Headers h
JOIN cteFilter f ON f.IDPK1 = h.IDPK
JOIN @Rows r ON r.IDPK = f.IDPK1
ORDER BY 1,3,4
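The idea behind the FOR XML approach is to reduce each header to one comparable "fingerprint" of its rows and then look for headers whose fingerprints are equal. A minimal Python sketch of that idea follows, assuming the rows fit in memory and using the sample data from the question; `matching_headers` is a hypothetical helper name, and a sorted tuple of (item, quantity) pairs stands in for the concatenated strings.

```python
from collections import defaultdict

def matching_headers(rows):
    """Group header ids whose (item_id, qty) rows are exactly the same.

    rows: iterable of (header_id, item_id, qty) tuples.
    Returns a sorted list of sorted header-id groups with 2+ members.
    """
    # Collect the rows of each header.
    items_by_header = defaultdict(list)
    for header_id, item_id, qty in rows:
        items_by_header[header_id].append((item_id, qty))

    # A sorted tuple of (item, qty) pairs is the header's fingerprint;
    # equal fingerprints mean same row count, same items, same quantities.
    headers_by_fingerprint = defaultdict(list)
    for header_id, items in items_by_header.items():
        headers_by_fingerprint[tuple(sorted(items))].append(header_id)

    return sorted(
        sorted(group)
        for group in headers_by_fingerprint.values()
        if len(group) > 1
    )

# Sample data from the question: (IDPK_Header, Item_ID, Qty)
sample_rows = [
    (1, "A", 10), (1, "B", 20),
    (2, "A", 10), (2, "B", 20),
    (3, "A", 5), (3, "B", 20),
    (4, "A", 10),
]
print(matching_headers(sample_rows))
```

Because the whole row set is part of the fingerprint, a header with fewer rows (Test4) can never match a header with more rows, which is the "exactly the same number of rows" requirement from the question.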

Write a function or regular expression that will split string in sql

I have values in a SQL table like this:
Id GameId GameSupplierId
1 1 NULL
2 2 NULL
3 3 1
4 3 2
5 3 3
What I want is to filter in a SQL procedure by GameId and, if there is a GameSupplierId, by supplier too. I will get a string from my web page in the format GameID;GameSupplierId. For example:
1; NULL
2; NULL
or if there is GameSupplier too
3;1
3;1,2
I also want to allow multiple choices, for example:
1,2,3;1,2
In my SQL query I will then filter like WHERE @GameID = Table.GameID (and also check @GameSupplierId IN (,,,))
Just add your desired columns into ORDER BY:
ORDER BY t.GameId, t.GameSuplierId
For example:
DECLARE @table TABLE
(
    ID INT,
    GameId INT,
    GameSuplierId INT NULL
)
INSERT INTO @table
(
    ID,
    GameId,
    GameSuplierId
)
VALUES
    (1, 1, NULL)
  , (2, 2, NULL)
  , (3, 3, 1)
  , (4, 3, 2)
  , (5, 3, 3)
SELECT
    *
FROM @table t
ORDER BY t.GameId, t.GameSuplierId
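The input format described in the question ("GameID;GameSupplierId", each side a comma-separated list, with NULL meaning "no supplier filter") can be split into ID lists before the values reach the query. This is only a sketch under the assumption that the parsing happens in application code; `parse_filter` is a hypothetical helper, not part of any library.

```python
def parse_filter(text):
    """Split 'games;suppliers' into two lists of ints.

    A side that is empty or the literal NULL becomes None,
    meaning "do not filter on this column".
    """
    game_part, _, supplier_part = text.partition(";")

    def to_ids(part):
        part = part.strip()
        if not part or part.upper() == "NULL":
            return None
        # int() tolerates surrounding whitespace, e.g. " 2"
        return [int(piece) for piece in part.split(",") if piece.strip()]

    return to_ids(game_part), to_ids(supplier_part)

for example in ["1; NULL", "3;1", "3;1,2", "1,2,3;1,2"]:
    print(example, "->", parse_filter(example))
```

The two lists could then be bound as query parameters, with the supplier condition skipped when the second value is None.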

DAX Code change suggestion

So I have the following DAX code for a measure. What I am trying to do is replace BillDetail[SourceWasteServiceID] with another column, BillDetail[SourceServiceMapID]. The problem is that a single SourceWasteServiceID can have multiple SourceServiceMapID records, and since the data has to be grouped together, I can't just directly replace one with the other. The table does have an IsCurrent flag, which is "1" for the latest record. I tried using IsCurrent in a FILTER statement, but I still get mismatched data.
Does anybody have any suggestions on how I can change this?
Thanks in advance for the help!!
Sum of Volume:=CALCULATE(
SUMX(
Summarize(BillDetail
,BillDetail[SourceWasteServiceID]
,BillDetail[ActualBillMonth]
,WasteServiceMap[ContainerCount]
,WasteServiceMap[WasteContainerSizeQuantity]
,WasteServiceMap[WasteContainerSizeUnit]
,WasteServiceMap[WastePickupSchedule]
,WasteServiceMap[WastePickupFrequencyMultiplier]
,WasteServiceMap[PercentFull]
,WasteServiceMap[CompactionRatio]
,"ItemQuantity", CALCULATE(Sum(BillDetail[ActualItemQuantity]),BillDetail[AlternateBillDetailKey] = True)
)
,IF ( UPPER((WasteServiceMap[WastePickupSchedule])) = "FIXED"
,(WasteServiceMap[ContainerCount])
* (WasteServiceMap[WasteContainerSizeQuantity])
*(IF(WasteServiceMap[WastePickupFrequencyMultiplier] = -1,0,WasteServiceMap[WastePickupFrequencyMultiplier]))
* (WasteServiceMap[PercentFull])
* (WasteServiceMap[CompactionRatio])
*IF(UPPER((WasteServiceMap[WasteContainerSizeUnit])) = "GALLONS"
, 0.00495113169
, IF(UPPER((WasteServiceMap[WasteContainerSizeUnit])) = "LITERS"
, 0.00130795062
,IF(UPPER((WasteServiceMap[WasteContainerSizeUnit])) = "YARDS"
,1
,BLANK())
)
)
, IF ( OR(OR(OR(UPPER((WasteServiceMap[WastePickupSchedule])) = "ON CALL" ,UPPER((WasteServiceMap[WastePickupSchedule])) = "MAILBACK"),UPPER((WasteServiceMap[WastePickupSchedule])) = "HAND PICKUP"),UPPER((WasteServiceMap[WastePickupSchedule])) = "SCHEDULED ONCALL")
, (WasteServiceMap[WasteContainerSizeQuantity])
* (WasteServiceMap[CompactionRatio])
* (WasteServiceMap[PercentFull])
* ([ItemQuantity])
*IF(UPPER((WasteServiceMap[WasteContainerSizeUnit])) = "GALLONS"
, 0.00495113169
, IF(UPPER((WasteServiceMap[WasteContainerSizeUnit])) = "LITERS"
, 0.00130795062
,IF(UPPER((WasteServiceMap[WasteContainerSizeUnit])) = "YARDS"
,1
,BLANK())
)
)
, 0
)
)
)
)
The example you provided does not look like it is only about joining the latest records to some "base" records. If it is related after all, though, we can play with the problem a little, just for fun.
Let's say we have two very simple tables in our database
create table parent_table
(
parent_id int identity(1, 1) primary key,
some_value nvarchar(100)
);
create table child_table
(
child_id int identity(1, 1) primary key,
parent_id int,
is_current bit,
some_value nvarchar(100)
);
with some meaningless, but related data
insert into parent_table (some_value)
values ('value 1'),('value 2'),('value 3'),('value 4');
insert into child_table (parent_id, is_current, some_value) values
(1, 1, 'value 1.1'),
(2, 0, 'value 2.1'),
(2, 0, 'value 2.2'),
(2, 1, 'value 2.3'),
(3, 0, 'value 3.1'),
(3, 1, 'value 3.2'),
(4, 0, 'value 4.1'),
(4, 1, 'value 4.2');
And... we want to find only current child data for every parent row.
If we wrote the query in T-SQL, it could look something like this
select p.parent_id
, p.some_value [parent_value]
, c.some_value [current_child_value]
from parent_table p
left join child_table c on p.parent_id = c.parent_id
and c.is_current = 1;
(4 row(s) affected)
parent_id parent_value current_child_value
-----------------------------------------------
1 value 1 value 1.1
2 value 2 value 2.3
3 value 3 value 3.2
4 value 4 value 4.2
Now we could try to build a simple tabular model on top of these tables
and write a DAX query against it
evaluate
filter (
addcolumns(
child_table,
"parent_value", related(parent_table[some_value])
),
child_table[is_current] = True
)
having received almost the same results as using T-SQL
child_table[child_id] child_table[parent_id] child_table[is_current] child_table[some_value] [parent_value]
------------------------------------------------------------------------------------------------------------------
8 4 True value 4.2 value 4
6 3 True value 3.2 value 3
4 2 True value 2.3 value 2
1 1 True value 1.1 value 1
I hope this is helpful enough to solve your problem, or at least points you in the right direction.

Recursive select in SQL

I have an issue I just can't get my head around. I know what I want, just simply can't get it out on the screen.
What I have is a table looking like this:
Id, PK UniqueIdentifier, NotNull
Name, nvarchar(255), NotNull
ParentId, UniqueIdentifier, Null
ParentId has an FK to Id.
What I want to accomplish is to get a flat list of all the ids below the Id I pass in.
example:
1 TestName1 NULL
2 TestName2 1
3 TestName3 2
4 TestName4 NULL
5 TestName5 1
The tree would look like this:
-1
  -> -2
    -> -3
  -> -5
-4
If I now ask for 4, I would only get 4 back, but if I ask for 1 I would get 1, 2, 3 and 5.
If I ask for 2, I would get 2 and 3 and so on.
Is there anyone who can point me in the right direction. My brain is fried so I appreciate all help I can get.
declare @T table(
    Id int primary key,
    Name nvarchar(255) not null,
    ParentId int)
insert into @T values
    (1, 'TestName1', NULL),
    (2, 'TestName2', 1),
    (3, 'TestName3', 2),
    (4, 'TestName4', NULL),
    (5, 'TestName5', 1)
declare @Id int = 1
;with cte as
(
    select T.*
    from @T as T
    where T.Id = @Id
    union all
    select T.*
    from @T as T
    inner join cte as C
        on T.ParentId = C.Id
)
select *
from cte
Result
Id Name ParentId
----------- -------------------- -----------
1 TestName1 NULL
2 TestName2 1
5 TestName5 1
3 TestName3 2
Here's a working example:
declare @t table (id int, name nvarchar(255), ParentID int)
insert @t values
    (1, 'TestName1', NULL),
    (2, 'TestName2', 1 ),
    (3, 'TestName3', 2 ),
    (4, 'TestName4', NULL),
    (5, 'TestName5', 1 );
; with rec as
(
    select t.name
         , t.id as baseid
         , t.id
         , t.parentid
    from @t t
    union all
    select t.name
         , r.baseid
         , t.id
         , t.parentid
    from rec r
    join @t t
        on t.ParentID = r.id
)
select *
from rec
where baseid = 1
You can filter on baseid, which contains the start of the tree you're querying for.
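The baseid technique can be tried quickly outside SQL Server. The sketch below assumes SQLite (through Python's `sqlite3` module) as a stand-in engine, so the syntax is `WITH RECURSIVE` rather than T-SQL, and `ids_below` is a made-up helper name. Every row seeds its own tree, the starting id is carried along as `base_id`, and the final filter picks one tree.

```python
import sqlite3

def ids_below(base_id):
    """Return base_id plus all ids beneath it in the sample tree (sketch)."""
    con = sqlite3.connect(":memory:")
    # Sample data from the question
    con.executescript("""
        CREATE TABLE t (id INT, name TEXT, parent_id INT);
        INSERT INTO t VALUES
            (1, 'TestName1', NULL),
            (2, 'TestName2', 1),
            (3, 'TestName3', 2),
            (4, 'TestName4', NULL),
            (5, 'TestName5', 1);
    """)
    rows = con.execute("""
        WITH RECURSIVE rec(base_id, id) AS (
            -- Anchor: every row starts its own tree
            SELECT id, id FROM t
            UNION ALL
            -- Recursive step: children inherit the base_id of their tree
            SELECT r.base_id, t.id
            FROM rec AS r
            JOIN t ON t.parent_id = r.id
        )
        SELECT id FROM rec WHERE base_id = ? ORDER BY id
    """, (base_id,)).fetchall()
    con.close()
    return [row[0] for row in rows]

print(ids_below(1))
```

Seeding from every row does more work than anchoring on a single id, so on a large table it may be worth filtering in the anchor instead when only one starting node is needed.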
Try this:
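WITH RecQry AS
(
    -- Anchor: start from one node (assumes @Id holds the starting Id)
    SELECT *
    FROM MyTable
    WHERE Id = @Id
    UNION ALL
    -- Recursive step: add the children of rows already found
    SELECT a.*
    FROM MyTable a INNER JOIN RecQry b
        ON a.ParentID = b.Id
)
SELECT *
FROM RecQry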
Here is a good article about Hierarchy ID models. It goes right from the start of the data right through to the query designs.
Also, you could use a Recursive Query using a Common Table Expression.
I'm guessing that the easiest way to accomplish what you're looking for would be to write a recursive query using a Common Table Expression:
MSDN - Recursive Queries Using Common Table Expressions