How to optimize a Database Model for a M:N relationship


Edit 10-Apr-2013
In order to make myself clear I am adding another (simplified) example showing the principle of what I am trying to achieve:
T1 - PERSONHAS        T2 - PRODUCTNEED
ANTON has WHEEL       CAR need ENGINE
ANTON has ENGINE      CAR need WHEEL
ANTON has NEEDLE      SHIRT need NEEDLE
BERTA has NEEDLE      SHIRT need THREAD
BERTA has THREAD      JAM need FRUIT
BERTA has ENGINE      JAM need SUGAR
Q3 - PERSONCANMAKE
ANTON canmake CAR
BERTA canmake SHIRT
Q4 - PERSONCANNOTMAKE
ANTON cannotmake SHIRT
ANTON cannotmake JAM
BERTA cannotmake CAR
BERTA cannotmake JAM
I have T1 and T2 and want to create queries for Q3 and Q4
End Edit 10-Apr-2013
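For illustration, here is a minimal sketch of Q3 and Q4 as "double NOT EXISTS" queries (relational division). It is an assumption-laden example, not a definitive answer: it runs on SQLite through Python's sqlite3 module so it can be executed anywhere, and the table and column names (PersonHas, ProductNeed, person, product, item) are hypothetical spellings of T1 and T2.

```python
import sqlite3


def build_person_product_db():
    """Load T1 (PersonHas) and T2 (ProductNeed) into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE PersonHas (person TEXT, item TEXT, PRIMARY KEY (person, item));
        CREATE TABLE ProductNeed (product TEXT, item TEXT, PRIMARY KEY (product, item));
        INSERT INTO PersonHas VALUES
            ('ANTON','WHEEL'), ('ANTON','ENGINE'), ('ANTON','NEEDLE'),
            ('BERTA','NEEDLE'), ('BERTA','THREAD'), ('BERTA','ENGINE');
        INSERT INTO ProductNeed VALUES
            ('CAR','ENGINE'), ('CAR','WHEEL'),
            ('SHIRT','NEEDLE'), ('SHIRT','THREAD'),
            ('JAM','FRUIT'), ('JAM','SUGAR');
    """)
    return conn


# Every (person, product) combination.
PAIRS = """
    SELECT p.person, n.product
    FROM (SELECT DISTINCT person FROM PersonHas) AS p
    CROSS JOIN (SELECT DISTINCT product FROM ProductNeed) AS n
"""

# Correlated subquery: yields a row for each needed item the person does not have.
MISSING = """
    SELECT 1 FROM ProductNeed AS need
    WHERE need.product = n.product
      AND NOT EXISTS (SELECT 1 FROM PersonHas AS has
                      WHERE has.person = p.person AND has.item = need.item)
"""

Q3 = PAIRS + " WHERE NOT EXISTS (" + MISSING + ") ORDER BY 1, 2"  # nothing missing
Q4 = PAIRS + " WHERE EXISTS (" + MISSING + ") ORDER BY 1, 2"      # something missing

conn = build_person_product_db()
can_make = conn.execute(Q3).fetchall()
cannot_make = conn.execute(Q4).fetchall()
```

The same two statements should port to other engines with little change, since they only use CROSS JOIN and correlated EXISTS subqueries.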
Preface:
In order to create a product (P) I need to have certain generic capabilities (C - like a factory, supply, electricity, water, etc.)
A product manager defines all generic capabilities needed to create his/her product.
In a location (L) I have certain generic capabilities (C)
A location manager defines the capabilities his/her location is able to provide. This could be a clear YES, a clear NO, or the location manager does not list a certain capability at all.
DB Model:
I have created the following root entities
Location (PK: L) - values L1, L2, L3 // in real ca. 250 rows of L
Product (PK: P) - values P1, P2 // in real ca. 150 rows of P
Capability (PK: C) - values C1, C2, C3 // in real ca. 80 rows of C
and the following child (dependent) entities
ProductCapabilityAssignment:P, C (PK: P, C, FK: P, C)
P1 C1
P1 C2
P2 C1
P2 C3
LocationCapabilityAssignment: L, C, Status (Y/N) (PK: L, C, FK: L, C)
L1 C1 Y
L2 C1 Y
L2 C2 Y
L2 C3 N
L3 C1 Y
L3 C2 Y
L3 C3 Y
Task:
The task is to find out whether a certain product can be produced at a certain location, whereby all capabilities defined for the product must be present at that location. To answer this, I saw no alternative but to
create a Cartesian product of Location and ProductCapabilityAssignment (CL_Cart) to ensure that for each location I am listing all possible products with their capability needs
CREATE VIEW CL_Cart AS
SELECT L.L, PCA.P, PCA.C
FROM Location AS L, ProductCapabilityAssignment AS PCA;
create an outer join between CL_Cart and LocationCapabilityAssignment to match in all capabilities a location can provide
CREATE VIEW Can_Produce AS
SELECT X.L, X.P, X.C, LCA.Status
FROM CL_Cart AS X LEFT JOIN LocationCapabilityAssignment AS LCA ON (X.C = LCA.C) AND (X.L = LCA.L);
so that finally I get
SELECT L, P, C, Status
FROM Can_Produce;
L1 P1 C1 Y
L1 P1 C2 NULL // C2 not listed for L1
L1 P2 C1 Y
L1 P2 C3 NULL // C3 not listed for L1
L2 P1 C1 Y
L2 P1 C2 Y
L2 P2 C1 Y
L2 P2 C3 N // C3 listed as "No" for L2
L3 P1 C1 Y
L3 P1 C2 Y
L3 P2 C1 Y
L3 P2 C3 Y
meaning that L1 can produce neither P1 nor P2, L2 can produce P1, and L3 can produce both P1 and P2.
So I can query Can_Produce for a specific product/location and see what I have and what I don't have in terms of capabilities. I can also provide a shortcut overall YES/NO answer by checking for Status = 'N' OR Status IS NULL: if any such row exists, the product cannot be produced.
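The shortcut YES/NO answer can be sketched as one aggregate over Can_Produce. The following is only an illustration under assumptions: it recreates the tables and both views from above on SQLite via Python's sqlite3 (the target engine is undecided), and the CanProduce column name and the helper function are made up for the example.

```python
import sqlite3


def build_capability_db():
    """Recreate the sample tables and the two views in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Location (L TEXT PRIMARY KEY);
        CREATE TABLE Product (P TEXT PRIMARY KEY);
        CREATE TABLE ProductCapabilityAssignment (P TEXT, C TEXT, PRIMARY KEY (P, C));
        CREATE TABLE LocationCapabilityAssignment (
            L TEXT, C TEXT, Status TEXT, PRIMARY KEY (L, C));
        INSERT INTO Location VALUES ('L1'), ('L2'), ('L3');
        INSERT INTO Product VALUES ('P1'), ('P2');
        INSERT INTO ProductCapabilityAssignment VALUES
            ('P1','C1'), ('P1','C2'), ('P2','C1'), ('P2','C3');
        INSERT INTO LocationCapabilityAssignment VALUES
            ('L1','C1','Y'), ('L2','C1','Y'), ('L2','C2','Y'), ('L2','C3','N'),
            ('L3','C1','Y'), ('L3','C2','Y'), ('L3','C3','Y');
        CREATE VIEW CL_Cart AS
            SELECT L.L, PCA.P, PCA.C
            FROM Location AS L, ProductCapabilityAssignment AS PCA;
        CREATE VIEW Can_Produce AS
            SELECT X.L, X.P, X.C, LCA.Status
            FROM CL_Cart AS X
            LEFT JOIN LocationCapabilityAssignment AS LCA
                   ON X.C = LCA.C AND X.L = LCA.L;
    """)
    return conn


# One row per (location, product): 'Y' only if no capability row is 'N' or NULL.
VERDICT = """
    SELECT L, P,
           CASE WHEN SUM(CASE WHEN Status = 'Y' THEN 0 ELSE 1 END) = 0
                THEN 'Y' ELSE 'N' END AS CanProduce
    FROM Can_Produce
    GROUP BY L, P
    ORDER BY L, P
"""

verdicts = build_capability_db().execute(VERDICT).fetchall()
```

Adding a WHERE clause on L or P before the GROUP BY restricts the check to one location or product, so the full Cartesian product need not be evaluated for every request.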
Question:
For a relational database like MSSQL, MySQL, or Oracle (not yet decided and beyond my influence), I am wondering whether I have chosen the correct data model for this M:N relationship or whether I could do any better. In particular, with ca. 250 locations, 150 products, and each product being defined by about 10 capabilities on average, the Cartesian product has about 375,000 rows, and I fear that performance will collapse due to huge memory consumption.
I would also really like to avoid stored procedures.
Any thoughts would be welcome.

--Environment Variables
Declare @Parent table (id int identity(1,1) primary key, Name varchar(20))
-- Assumed rows: the listing never populates @Parent, but the test data below looks up these names
Insert into @Parent (Name) values ('Driver'),('Seamstress'),('Confectionarist')
Declare @Components table (id int identity(1,1) primary key, Name varchar(20))
Insert into @Components (Name) values ('Engine'),('Wheel'),('Chassis'),('NEEDLE'),('THREAD'),('FRUIT'),('SUGAR')
Declare @Objects table (id int identity(1,1) primary key, Name varchar(20))
-- Assumed rows: likewise missing from the listing, inferred from the names used below
Insert into @Objects (Name) values ('Car'),('Shirt'),('Jam')
Declare @Template table (id int identity(1,1) primary key, Name varchar(20), ObjectID int, ComponentID int)
Insert into @Template (Name, ObjectID, ComponentID)
Select 'Vehicle', O.ID, C.ID from @Objects O, @Components C where O.Name = 'Car' and C.Name in ('Engine','Wheel','Chassis') union
Select 'Clothing', O.ID, C.ID from @Objects O, @Components C where O.Name = 'Shirt' and C.Name in ('Needle','Thread') union
Select 'Confectionary', O.ID, C.ID from @Objects O, @Components C where O.Name = 'JAM' and C.Name in ('FRUIT','SUGAR')
Declare @AvailableMaterials table (id int identity(1,1) primary key, TestType varchar(20), ParentID int, ComponentID int)
--Test Data
Insert into @AvailableMaterials (TestType,ParentID,ComponentID)
Select 'CompleteSet', P.ID, T.ComponentID from @Parent P, @Template T where P.Name = 'Driver' and T.Objectid = (Select ID from @Objects where Name = 'Car') union
Select 'CompleteSet', P.ID, T.ComponentID from @Parent P, @Template T where P.Name = 'Seamstress' and T.Objectid = (Select ID from @Objects where Name = 'Shirt') union
Select 'IncompleteSet', P.ID, T.ComponentID from @Parent P, @Template T where P.Name = 'Confectionarist' and T.Objectid = (Select ID from @Objects where Name = 'Jam')
and T.ComponentID not in (Select ID from @Components where Name = 'FRUIT')
/*What sets are incomplete?*/
Select *
from @AvailableMaterials
where ID in (
Select SRCData.ID
from @AvailableMaterials SRCData cross apply (Select ObjectID from @Template T where ComponentID = SRCData.ComponentID) ObjectsMatchingComponents
inner join @Template T
on SRCData.ComponentID = T.ComponentID
and T.ObjectID = ObjectsMatchingComponents.ObjectID
cross apply (Select ObjectID, ComponentID from @Template FullTemplate where FullTemplate.ObjectID = T.ObjectID and FullTemplate.ComponentID not in (Select ComponentID from @AvailableMaterials SRC where SRC.ComponentID = FullTemplate.ComponentID)) FullTemplate
)
/*What sets are complete?*/
Select *
from @AvailableMaterials
where ID not in (
Select SRCData.ID
from @AvailableMaterials SRCData cross apply (Select ObjectID from @Template T where ComponentID = SRCData.ComponentID) ObjectsMatchingComponents
inner join @Template T
on SRCData.ComponentID = T.ComponentID
and T.ObjectID = ObjectsMatchingComponents.ObjectID
cross apply (Select ObjectID, ComponentID from @Template FullTemplate where FullTemplate.ObjectID = T.ObjectID and FullTemplate.ComponentID not in (Select ComponentID from @AvailableMaterials SRC where SRC.ComponentID = FullTemplate.ComponentID)) FullTemplate
)
Hi
This is the best I can come up with... It works on the premise that you have to know what the complete set is, to know what's missing. Once you have what's missing, you can tell the complete sets from the incomplete sets.
I doubt this solution will scale well, even if moved to #tables with indexing. Possibly though...
I too would be interested in seeing a cleaner approach. The above solution was developed in a SQL 2012 version. Note cross apply which limits the Cartesian effect somewhat.
Hope this helps.

I'm not sure what database you are using, but here is an example that would work in sql server - shouldn't require many changes to work in other databases...
WITH ProductLocation
AS
(
SELECT P.P,
P.Name as ProductName,
L.L,
L.Name as LocationName
FROM Product P
CROSS
JOIN Location L
),
ProductLocationCapability
AS
(
SELECT PL.P,
PL.ProductName,
PL.L,
PL.LocationName,
COUNT(PC.C) AS RequiredCapabilities, -- number of capabilities the product needs
SUM(CASE WHEN LC.Status = 'Y' THEN 1 ELSE 0 END) AS FoundCapabilities -- only a listed 'Y' counts as present
FROM ProductLocation PL
JOIN ProductCapabilityAssignment PC
ON PC.P = PL.P
LEFT
JOIN LocationCapabilityAssignment LC
ON LC.L = PL.L
AND LC.C = PC.C
GROUP BY PL.P, PL.ProductName, PL.L, PL.LocationName
)
SELECT PLC.P,
PLC.ProductName,
PLC.L,
PLC.LocationName,
CASE WHEN PLC.RequiredCapabilities = PLC.FoundCapabilities THEN 'Y' ELSE 'N' END AS CanMake
FROM ProductLocationCapability PLC
(Not sure if the field names are correct, I couldn't quite make sense of the schema description!)
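To check the count-comparison idea against the sample data from the question, here is a small sketch. It is not the exact query above: it assumes SQLite via Python's sqlite3, leaves out the Name columns (the question's schema does not list them), counts only Status = 'Y' rows as found, and uses HAVING to return just the producible pairs.

```python
import sqlite3

SCHEMA_AND_DATA = """
    CREATE TABLE Location (L TEXT PRIMARY KEY);
    CREATE TABLE Product (P TEXT PRIMARY KEY);
    CREATE TABLE ProductCapabilityAssignment (P TEXT, C TEXT, PRIMARY KEY (P, C));
    CREATE TABLE LocationCapabilityAssignment (
        L TEXT, C TEXT, Status TEXT, PRIMARY KEY (L, C));
    INSERT INTO Location VALUES ('L1'), ('L2'), ('L3');
    INSERT INTO Product VALUES ('P1'), ('P2');
    INSERT INTO ProductCapabilityAssignment VALUES
        ('P1','C1'), ('P1','C2'), ('P2','C1'), ('P2','C3');
    INSERT INTO LocationCapabilityAssignment VALUES
        ('L1','C1','Y'), ('L2','C1','Y'), ('L2','C2','Y'), ('L2','C3','N'),
        ('L3','C1','Y'), ('L3','C2','Y'), ('L3','C3','Y');
"""

# Keep a (location, product) pair only when every required capability
# has a matching location row with Status = 'Y'.
PRODUCIBLE = """
    SELECT L.L, P.P
    FROM Product AS P
    CROSS JOIN Location AS L
    JOIN ProductCapabilityAssignment AS PC ON PC.P = P.P
    LEFT JOIN LocationCapabilityAssignment AS LC
           ON LC.L = L.L AND LC.C = PC.C
    GROUP BY L.L, P.P
    HAVING COUNT(*) = SUM(CASE WHEN LC.Status = 'Y' THEN 1 ELSE 0 END)
    ORDER BY L.L, P.P
"""


def producible_pairs():
    """Return the (location, product) pairs whose requirements are all met."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA_AND_DATA)
    return conn.execute(PRODUCIBLE).fetchall()


pairs = producible_pairs()
```

On the sample data this agrees with the result stated in the question: L2 can produce P1, and L3 can produce P1 and P2.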

Related

Filter a join based on multiple rows

I'm trying to write a query that filters a join based on several rows in another table. Hard to put it into words, so I'll provide a cut-back simple example.
Parent Child
P1 C1
P1 C2
P1 C3
P2 C1
P2 C2
P2 C4
P3 C1
P3 C3
P3 C5
Essentially all rows are stored in the same table, however there is a ParentID allowing one item to link to another (parent) row.
The stored procedure is taking a comma delimited list of "child" codes, and based on whatever is in this list, I need to provide a list of potential siblings.
For example, if the comma delimited list was empty, the returned rows should be C1, C2, C3, C4, C5. If the list is "C2", the returned rows would be C1, C3, C4, and if the list is "C1, C2", the only returned rows would be C3 and C4.
Sample query:
SELECT [S].[ID]
FROM utItem [P]
INNER JOIN utItem [C]
ON [P].[ID] = [C].[ParentID]
INNER JOIN
(
-- Encapsulated to simplify sample.
SELECT [ID]
FROM udfListToRows( @ChildList )
GROUP BY
[ID]
) [DT]
ON [DT].[ID] = [C].[ID]
/*
In the event where I passed in "C2", this would work, it would return C1, C3, C4.
However this falls apart the moment there is more than 1 value in @ChildList. If I pass in "C2, C3", it would return siblings for either. But I only want siblings of BOTH.
**/
INNER JOIN [utItem] [S]
ON [C].[ParentID] = [S].[ParentID]
AND [C].[ID] <> [S].[ID]
WHERE
@ChildList IS NOT NULL
GROUP BY
[S].[ID]
UNION ALL
-- In the event that no @ChildList values are provided, return a full list of possible children (e.g. 1,2,3,4,5).
SELECT [C].[ID]
FROM [utItem] [P]
INNER JOIN [utItem] [C]
ON [P].[ID] = [C].[ParentID]
WHERE
@ChildList IS NULL
GROUP BY
[C].[ID]

Firstly, you can split your data into a table variable for ease of use
DECLARE @input TABLE (NodeId varchar(2));
INSERT @input (NodeId)
SELECT [ID]
FROM udfListToRows( @ChildList ); -- or STRING_SPLIT or whatever
Assuming you already had your data in a proper table variable (rather than a comma-separated list) you can do this
DECLARE @totalCount int = (SELECT COUNT(*) FROM @input);
SELECT DISTINCT
t.Child
FROM (
SELECT
t.Parent,
t.Child,
i.NodeId,
COUNT(i.NodeId) OVER (PARTITION BY t.Parent) matches
FROM YourTable t
LEFT JOIN @input i ON i.NodeId = t.Child
) t
WHERE t.matches = @totalCount
AND t.NodeId IS NULL;
This is a kind of relational division
Left-join the input to the main table
Using a window function, calculate how many matches you get per Parent
There must be at least as many matches as there are inputs
We take the distinct Child, excluding the original inputs
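The steps above can be sketched in runnable form. This is a variant under assumptions, not the query above: it uses SQLite via Python's sqlite3, a GROUP BY ... HAVING subquery instead of the window function, a hypothetical item(parent, child) table holding the sample rows, and a Python list in place of the comma-delimited parameter.

```python
import sqlite3


def build_item_db():
    """Sample parent/child rows from the question, in a hypothetical 'item' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE item (parent TEXT, child TEXT, PRIMARY KEY (parent, child))")
    conn.executemany("INSERT INTO item VALUES (?, ?)", [
        ("P1", "C1"), ("P1", "C2"), ("P1", "C3"),
        ("P2", "C1"), ("P2", "C2"), ("P2", "C4"),
        ("P3", "C1"), ("P3", "C3"), ("P3", "C5"),
    ])
    return conn


def potential_siblings(conn, children):
    """Children of every parent that contains ALL given children, minus the inputs."""
    wanted = sorted(set(children))
    if not wanted:
        # Empty input: every child is a candidate.
        rows = conn.execute("SELECT DISTINCT child FROM item ORDER BY child")
        return [r[0] for r in rows]
    marks = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT DISTINCT child FROM item
        WHERE parent IN (
            -- parents matching every input; (parent, child) is unique,
            -- so COUNT(*) equals the number of distinct matches
            SELECT parent FROM item
            WHERE child IN ({marks})
            GROUP BY parent
            HAVING COUNT(*) = ?
        )
        AND child NOT IN ({marks})
        ORDER BY child
    """
    rows = conn.execute(sql, [*wanted, len(wanted), *wanted])
    return [r[0] for r in rows]


conn = build_item_db()
```

For example, `potential_siblings(conn, ["C1", "C2"])` keeps only parents P1 and P2, because P3 lacks C2.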

SQL Server. T-SQL. CASE IF EXISTS query

I have a classification table that looks like this:
ID CLASSIFICATION
__________________
A1 BOARD
A2 SURFBOARD
A3 SURF
Then I have a category table that looks like this
CATEGORY PARENT INDENT
____________________________________
SURF NULL 3
SURFBOARD SURF 2
BOARD SURFBOARD 1
I want to make a SQL query, that returns this:
INDENT3 INDENT2 INDENT1 ID
______________________________________
SURF NULL NULL A3
SURF SURFBOARD NULL A2
SURF SURFBOARD BOARD A1
Is it possible? I'm not getting any ideas. It seems like I need to loop through the classification table and check whether there is an indent1, indent2 and indent3, but I'm not sure if I can put a script in a query, or if there is some kind of query I can use to achieve this. Something like
FOREACH CLASSIFICATION
CASE EXIST CATEGORY WITH INDENT 1 ELSE NULL AS INDENT1,
CASE EXIST CATEGORY WITH INDENT 2 ELSE NULL AS INDENT2,
CASE EXIST CATEGORY WITH INDENT 3 ELSE NULL AS INDENT3

If your hierarchy has a maximum of 3 levels a simpler query may be:
select cl.classification as indent3, null as indent2, null as indent1, id
from classification cl
join category ca on ca.category = cl.classification
where ca.indent = 3
union
select ca2.category as indent3, cl.classification as indent2, null as indent1, id
from classification cl
join category ca on ca.category = cl.classification
join category ca2 on ca.parent = ca2.category
where ca.indent = 2
union
select ca3.category as indent3, ca2.category as indent2, cl.classification as indent1, id
from classification cl
join category ca on ca.category = cl.classification
join category ca2 on ca.parent = ca2.category
join category ca3 on ca2.parent = ca3.category
where ca.indent = 1
If you have an indefinite number of parent/child levels you might be better to search "parent hierarchy with CTE" for a more complicated but flexible method.
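As a rough sketch of that CTE approach, the following walks from each category up to its root and builds a path string. It is an illustration under assumptions: it uses SQLite's WITH RECURSIVE through Python's sqlite3 (SQL Server omits the RECURSIVE keyword and concatenates with + rather than ||), and it returns one path column instead of separate indent columns.

```python
import sqlite3


def build_hierarchy_db():
    """Sample classification and category rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE classification (id TEXT PRIMARY KEY, classification TEXT);
        CREATE TABLE category (category TEXT PRIMARY KEY, parent TEXT, indent INTEGER);
        INSERT INTO classification VALUES
            ('A1','BOARD'), ('A2','SURFBOARD'), ('A3','SURF');
        INSERT INTO category VALUES
            ('SURF',NULL,3), ('SURFBOARD','SURF',2), ('BOARD','SURFBOARD',1);
    """)
    return conn


# chain.leaf   = the category we started from
# chain.parent = next ancestor still to visit (NULL once the root is reached)
# chain.path   = ancestors collected so far, root first
PATHS = """
    WITH RECURSIVE chain(leaf, parent, path) AS (
        SELECT category, parent, category FROM category
        UNION ALL
        SELECT chain.leaf, c.parent, c.category || ' > ' || chain.path
        FROM chain
        JOIN category AS c ON c.category = chain.parent
    )
    SELECT cl.id, chain.path
    FROM classification AS cl
    JOIN chain ON chain.leaf = cl.classification
    WHERE chain.parent IS NULL
    ORDER BY cl.id
"""

paths = build_hierarchy_db().execute(PATHS).fetchall()
```

Because the recursion follows the parent column, it works for any depth; the indent column is not needed at all.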

Your logic is a bit hard to follow, but this should produce the results that you specify:
select max(case when ca.indent = 3 then ca.category end) over () as indent3,
(case when ca.indent < 3
then max(case when ca.indent = 2 then ca.category end) over ()
end) as indent2,
(case when ca.indent < 2
then max(case when ca.indent = 1 then ca.category end) over ()
end) as indent1,
cl.id
from category ca join
classification cl
on ca.category = cl.classification

People who are not so good with SQL will tell you to avoid this; SQL gurus will treat you as one of the pack.
Your model is somewhat hard to understand, but I gave it a shot.
A tip from the coach: if you need an id, use incremental numbers generated by the database. They are faster, index well, and carry no risk of unexpected duplicates or truncation.
CREATE TABLE cs (id char(10), classification char(20));
CREATE TABLE ct (category char(20), parent char(20), indent int);
INSERT INTO cs VALUES
('A1','BOARD'),
('A2','SURFBOARD'),
('A3','SURF');
INSERT INTO ct VALUES
('SURF',null,3),
('SURFBOARD','SURF',2),
('BOARD','SURFBOARD',1);
select cs.id, ct1.category indent1,ct2.category indent2,ct3.category indent3 from ct ct1
left join ct ct2 on ct2.category = ct1.parent
left join ct ct3 on ct3.category = ct2.parent
left join cs on cs.classification = ct1.category

Use Data of 1 table into another one dynamically

I have a table, category_codes, with data like
SELECT Item, Code, Prefix from category_codes
Item Code Prefix
Bangles BL BL
Chains CH CH
Ear rings ER ER
Sets Set ST
Rings RING RG
Yellow GOld YG YG........
I have another table item_categories having data like
select code,name from item_categories
code name
AQ.TM.PN AQ.TM.PN
BL.YG.CH.ME.PN BL.YG.CH.ME.PN
BS.CZ.ST.YG.PN BS.CZ.ST.YG.PN
CR.YG CR.YG.......
I want to update the item_categories.name column with the matching category_codes.item values, like
code name
BL.YG.CH.ME.PN Bangles.Yellow Gold.Chains.. . . .
Please suggest good solution for that. Thanks in advance.

First, split the code into several rows, join them with the category codes, and then concatenate the result to update the table.
Here is an example, based on the data you gave
create table #category_code (item varchar(max), code varchar(max), prefix varchar(max));
create table #item_categories (code varchar(max), name varchar(max));
insert into #category_code (item, code, prefix) values ('Bangles','BL','BL'),('Chains','CH','CH'),('Ear rings','ER','ER'), ('Sets','Set','ST'),('Rings','RING','RG'), ('Yellow gold','YG','YG');
insert into #item_categories (code, name) values ('AQ.TM.PN','AQ.TM.PN'),('BL.YG.CH.ME.PN','BL.YG.CH.ME.PN'),('BS.CZ.ST.YG.PN','BS.CZ.ST.YG.PN');
;with splitted as ( -- split the codes into individual code
select row_number() over (partition by ic.code order by ic.code) as id, ic.code, x.value, cc.item
from #item_categories ic
outer apply string_split(ic.code, '.') x -- SQL Server 2016+, otherwise, use another method to split the data
left join #category_code cc on cc.code = x.value -- some values are missing in you example, but can use an inner join
)
, joined as ( -- then joined them to concat the name
select id, convert(varchar(max),code) as code, convert(varchar(max),coalesce(item + ',','')) as Item
from splitted
where id = 1
union all
select s.id, convert(varchar(max), s.code), convert(varchar(max), j.item + coalesce(s.item + ',',''))
from splitted s
inner join joined j on j.id = s.id - 1 and j.code = s.code
)
update #item_categories
set name = substring (j.item ,1,case when len(j.item) > 1 then len(j.item)-1 else 0 end)
output deleted.name, inserted.name
from #item_categories i
inner join joined j on j.code = i.code
inner join (select code, max(id)maxid from joined group by code) mj on mj.code = j.code and mj.maxid = j.id
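The split, look up, and re-join idea can also be sketched outside T-SQL. The following is a minimal illustration under assumptions: SQLite via Python's sqlite3 holds the two tables, the splitting is done in Python rather than with string_split, parts are matched on category_codes.code (as in the query above; matching on prefix would be a one-word change), the parts are re-joined with "." as in the question's desired output, and parts without a match are kept unchanged.

```python
import sqlite3


def build_category_db():
    """Sample rows from the question, in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE category_codes (item TEXT, code TEXT PRIMARY KEY, prefix TEXT);
        CREATE TABLE item_categories (code TEXT PRIMARY KEY, name TEXT);
        INSERT INTO category_codes VALUES
            ('Bangles','BL','BL'), ('Chains','CH','CH'), ('Ear rings','ER','ER'),
            ('Sets','Set','ST'), ('Rings','RING','RG'), ('Yellow Gold','YG','YG');
        INSERT INTO item_categories VALUES
            ('AQ.TM.PN','AQ.TM.PN'), ('BL.YG.CH.ME.PN','BL.YG.CH.ME.PN'),
            ('BS.CZ.ST.YG.PN','BS.CZ.ST.YG.PN'), ('CR.YG','CR.YG');
    """)
    return conn


def expand_names(conn):
    """Rewrite item_categories.name by translating each dot-separated code part."""
    lookup = dict(conn.execute("SELECT code, item FROM category_codes"))
    codes = [row[0] for row in conn.execute("SELECT code FROM item_categories")]
    for code in codes:
        # Unknown parts (e.g. 'ME') fall back to the code itself.
        name = ".".join(lookup.get(part, part) for part in code.split("."))
        conn.execute(
            "UPDATE item_categories SET name = ? WHERE code = ?", (name, code))
    conn.commit()


conn = build_category_db()
expand_names(conn)
names = dict(conn.execute("SELECT code, name FROM item_categories"))
```

Splitting in application code preserves the order of the parts, which string_split does not guarantee unless its ordinal output is used.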

SQL Server Recursive CTE for finding all the dependencies

I have a table with the following data.
Road City
R1 C1
R2 C2
R3 C1
R3 C3
R4 C3
R4 C5
R5 C5
If R1 is the input I need to get R1, R3, R4 and R5 as the output. This is because R1 belongs to C1 and C1 has R3 and R3 also belongs to C3 which has R4 and similarly R5.
I was trying to make use of CTE recursion but not able to get it to work. I tried stored procedure recursive call but it goes only 30 levels deep.
with tmp1 as (
select ROAD, CITY, 1 as Level from table R1 WHERE ROAD = 1712
UNION ALL
select R2.ROAD, R2.CITY,Level + 1 as Level
from tmp1 INNER JOIN table R2 ON tmp1.CITY = R2.CITY and tmp1.ROAD <> R2.ROAD
)
select * from tmp1
OPTION (maxrecursion 0)
Any thoughts greatly appreciated!

A recursive CTE will not work without some way of breaking cycles. Other database vendors have specific features for disallowing a row to be added twice. Unless something was added in the latest releases, Microsoft SQL Server does not.
The following does not work, because the recursive clause is referring to the CTE twice. (Or it contains a subquery)
WITH recur AS (SELECT Road, City
FROM @Map
WHERE Road = @StartingRoad
--
UNION ALL
--
SELECT next.Road, next.City
FROM @Map next
INNER JOIN recur
ON (recur.City = next.City AND recur.Road <> next.Road)
OR (recur.City <> next.City AND recur.Road = next.Road)
WHERE NOT EXISTS (SELECT NULL
FROM recur test
WHERE test.Road = next.Road AND test.City = next.City))
SELECT *
FROM recur;
Msg 253, Level 16, State 1, Line 36
Recursive member of a common table expression 'recur' has multiple recursive references.
It is possible with a straightforward loop, which you could stick in a stored procedure:
DECLARE @Map TABLE (Road VARCHAR(2), City VARCHAR(2));
INSERT INTO @Map (Road, City)
VALUES ('R1', 'C1')
, ('R2', 'C2')
, ('R3', 'C1')
, ('R3', 'C3')
, ('R4', 'C3')
, ('R4', 'C5')
, ('R5', 'C5');
DECLARE @StartingRoad VARCHAR(2) = 'R1';
DECLARE @Results TABLE (Road VARCHAR(2), City VARCHAR(2));
INSERT INTO @Results (Road, City)
SELECT Road, City
FROM @Map
WHERE Road = @StartingRoad;
WHILE (1=1) BEGIN
INSERT INTO @Results (Road, City)
SELECT next.Road, next.City
FROM @Map next
INNER JOIN @Results r
ON (r.City = next.City AND r.Road <> next.Road)
OR (r.City <> next.City AND r.Road = next.Road)
WHERE NOT EXISTS (SELECT NULL
FROM @Results test
WHERE test.Road = next.Road AND test.City = next.City);
IF @@ROWCOUNT = 0
BREAK;
END;
SELECT DISTINCT Road
FROM @Results
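For comparison, here is a sketch of what "disallowing a row to be added twice" looks like on an engine that supports it. This is not SQL Server code: it assumes SQLite via Python's sqlite3, where writing UNION instead of UNION ALL in a recursive CTE discards rows that were already produced, which is what breaks the cycle. The table name map and the function name are made up for the example.

```python
import sqlite3

ROADS = [("R1", "C1"), ("R2", "C2"), ("R3", "C1"), ("R3", "C3"),
         ("R4", "C3"), ("R4", "C5"), ("R5", "C5")]


def connected_roads(pairs, start):
    """Return every road reachable from `start` through shared cities."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE map (road TEXT, city TEXT, PRIMARY KEY (road, city))")
    conn.executemany("INSERT INTO map VALUES (?, ?)", pairs)
    sql = """
        WITH RECURSIVE reach(road, city) AS (
            SELECT road, city FROM map WHERE road = ?
            UNION  -- not UNION ALL: already-seen rows are dropped, so cycles end
            SELECT n.road, n.city
            FROM map AS n
            JOIN reach AS r ON r.city = n.city OR r.road = n.road
        )
        SELECT DISTINCT road FROM reach ORDER BY road
    """
    return [row[0] for row in conn.execute(sql, (start,))]


reachable = connected_roads(ROADS, "R1")
```

The recursive step mirrors the join condition of the loop: a row joins the result when it shares a city or a road with a row that is already there.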

This might get you partially there. I am doing a partial Cartesian join to generate the city/road combinations, and artificially restricting the recursion to 20 levels. I can't help but think there is a better way.
WITH cte(road,
city,
connected_city,
connected_road)
AS (
SELECT a.road,
a.city,
b.city AS connected_city,
b.Road connected_road
FROM deleteme a
INNER JOIN deleteme b ON a.City = b.City
WHERE a.road <> b.road)
SELECT DISTINCT
road
FROM cte;
road
R1
R3
R4
R5

Using MS SQL Server 2005, how can I consolidate detail records into a single comma separated list

BACKGROUND: I am running MS SQL Server 2005. I have a MASTER table (ID, MDESC) and a DETAIL table (MID, DID, DDESC) with data as follows
1 MASTER_1
2 MASTER_2
1 L1 DETAIL_M1_L1
1 L2 DETAIL_M1_L2
1 L3 DETAIL_M1_L3
2 L1 DETAIL_M2_L1
2 L2 DETAIL_M2_L2
If I join the tables with
SELECT M.*, D.DID FROM MASTER M INNER JOIN DETAIL D on M.ID = D.MID
I get a list like the following:
1 MASTER_1 L1
1 MASTER_1 L2
1 MASTER_1 L3
2 MASTER_2 L1
2 MASTER_2 L2
QUESTION: Is there any way to use a MS SQL select statement to get the detail records into a comma separated list like this:
1 MASTER_1 "L1, L2, L3"
2 MASTER_2 "L1, L2"

You need a function:-
CREATE FUNCTION [dbo].[FN_DETAIL_LIST]
(
@masterid int
)
RETURNS varchar(8000)
AS
BEGIN
DECLARE @dids varchar(8000)
-- append each DID of this master row, separated by ', '
SELECT @dids = COALESCE(@dids + ', ', '') + DID
FROM DETAIL
WHERE MID = @masterid
RETURN @dids
END
Usage:-
SELECT ID, [dbo].[FN_DETAIL_LIST](ID) [DIDS]
FROM MASTER

Thanks to the concept in the link from Bill Karwin, it's the CROSS APPLY that makes it work
SELECT ID, DES, LEFT(DIDS, LEN(DIDS)-1) AS DIDS
FROM MASTER M1 INNER JOIN DETAIL D on M1.ID = D.MID
CROSS APPLY (
SELECT DID + ', '
FROM MASTER M2 INNER JOIN DETAIL D on M2.ID = D.MID
WHERE M1.ID = M2.ID
FOR XML PATH('')
) pre_trimmed (DIDS)
GROUP BY ID, DES, DIDS
RESULTS:
ID DES DIDS
--- ---------- ---------------
1 MASTER_1 L1, L2, L3
2 MASTER_2 L1, L2

coalesce is your friend.
declare @CSL varchar(max)
set @CSL = NULL
select @CSL = coalesce(@CSL + ', ', '') + cast(DID as varchar(8))
from MASTER M INNER JOIN DETAIL D on M.ID = D.MID
select @CSL
This will not work well for a generalized query (i.e. works great for a single master record).
You could drop this into a function... but that may not give you the performance you need/want.

This is the purpose of MySQL's GROUP_CONCAT() aggregate function. Unfortunately, it's not very easy to duplicate this function in other RDBMS brands that don't support it.
See Simulating group_concat MySQL function in Microsoft SQL Server 2005?
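To show what such an aggregate does with the sample data, here is a small sketch. It does not run on SQL Server 2005: it assumes SQLite via Python's sqlite3, whose group_concat(X, Y) aggregate is similar in spirit to MySQL's GROUP_CONCAT(). The order of the concatenated values is not guaranteed by SQLite.

```python
import sqlite3


def build_master_detail_db():
    """Sample MASTER and DETAIL rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE MASTER (ID INTEGER PRIMARY KEY, MDESC TEXT);
        CREATE TABLE DETAIL (MID INTEGER, DID TEXT, DDESC TEXT,
                             PRIMARY KEY (MID, DID));
        INSERT INTO MASTER VALUES (1,'MASTER_1'), (2,'MASTER_2');
        INSERT INTO DETAIL VALUES
            (1,'L1','DETAIL_M1_L1'), (1,'L2','DETAIL_M1_L2'), (1,'L3','DETAIL_M1_L3'),
            (2,'L1','DETAIL_M2_L1'), (2,'L2','DETAIL_M2_L2');
    """)
    return conn


# One row per master record, with its detail IDs folded into one string.
LIST_SQL = """
    SELECT M.ID, M.MDESC, group_concat(D.DID, ', ') AS DIDS
    FROM MASTER AS M
    JOIN DETAIL AS D ON M.ID = D.MID
    GROUP BY M.ID, M.MDESC
    ORDER BY M.ID
"""

detail_lists = build_master_detail_db().execute(LIST_SQL).fetchall()
```

The GROUP BY does the per-master grouping that the function-based and FOR XML PATH workarounds emulate on SQL Server 2005.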

I think you need a function for this to work properly in recent versions of SQL Server:
http://sqljunkies.com/WebLog/amachanic/archive/2004/11/10/5065.aspx?Pending=true
