1 to 1 relationship with exactly one of multiple tables - sql

OK, I hope I'll be able to make clear what my problem is:
I have a database with 5 tables. Let's call them A and B, V_1, V_2, and V_3. A and B represent a list of things to be done. These actions are described in the V_i tables. Now, A represents sort of a template of stuff that has to be done with a certain type of item. B, on the other hand, describes what has to be done (or has been done) with a concrete instance of the abstract item described by A. So in OOP terminology one might say that A represents a class and B represents an instance of A. Whenever something is inserted into table B, the related data from table A is copied, so that it can be modified for that specific item without affecting A.
Okay, so here is the actual problem: How do I model this properly? My main concern is that each record in V_i must not be linked to both A and B. It has to be a 1 to 1 relationship with EITHER A OR B. Also, V_i and V_j must not be linked to the same record in A or B. I have no clue how to do this properly. The current structure looks like this:
A and B have a PK called ID. Each V_i also has a PK called ID and two FKs that reference A or B, let's call them A_ID and B_ID. Now, the current implementation ensures that either A_ID or B_ID is NULL, but not both. However, I was wondering if there is a better way to do this. Additionally, there is the problem that multiple V_i could reference the same entry in A or B.
So, I hope my problem is clear. Is there a way to properly model this with relational databases without relying on external code to enforce the constraints? Thanks for your input in advance.
Best regards
David

In relational theory, one-to-one relationships are generally translated to a single table in the physical model. This single table would contain rows from both tables and you would use check constraints to determine the type of the row. This is by far the simplest way to get reliable 1-to-1 relationships.

First thing: when designing a database, you express relations between records not tables.
You are expressing your problem with an OO point of view. This paradigm cannot be used to design tables (SQL being a declarative language).
Otherwise, you can add constraints on your tables that enforce your predicate.
Maybe Oracle offers other possibilities I don't know.

The most common way to model the class - instance relationship in rdbs is
Class = table
Instance = row
Think about it: you insert a new row for each new instance; where you do not insert data, defaults are inserted, which give you class data; and triggers give you class-level behaviours.
Alternatively, give A and B the same primary key, and set the PK of B to be an FK to the PK of A. When a row is inserted into B, the DBMS will check that a "parent" row exists in A. This is probably easier to see in a drawing:
+---------+    +----------+
| Table A |    | Table B  |
+---------+    +----------+
| id (PK) |<---| id* (PK) |
| col1    |    | colB1    |
| ...     |    | ...      |
+---------+    +----------+
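The shared primary key idea can be tried out with a minimal sketch. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in for whatever DBMS is actually in use; the helper try_insert and the sample values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE a (id INTEGER PRIMARY KEY, col1 TEXT);
-- b.id is both the primary key of b and a foreign key to a.id,
-- so every row of b needs a parent in a, and a parent has at most one row in b.
CREATE TABLE b (id INTEGER PRIMARY KEY REFERENCES a(id), colB1 TEXT);
""")


def try_insert(sql, params=()):
    """Run one INSERT; return True if accepted, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


parent_ok = try_insert("INSERT INTO a (id, col1) VALUES (1, 'template')")
child_ok = try_insert("INSERT INTO b (id, colB1) VALUES (1, 'instance')")
# No row with id 2 exists in a, so the foreign key rejects this row.
orphan_ok = try_insert("INSERT INTO b (id, colB1) VALUES (2, 'no parent')")
# The primary key of b rejects a second row for the same parent.
duplicate_ok = try_insert("INSERT INTO b (id, colB1) VALUES (1, 'second instance')")
```

The first two inserts are accepted and the last two are rejected, which is the 1-to-1 behaviour described above.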

Preface: This is a bad design, as others have noted.
Assumptions:
create table a (a_id number primary key);
create table b (b_id number primary key);
create table v1
(v1_id number primary key, a_id number references a, b_id number references b);
create table v2
(v2_id number primary key, a_id number references a, b_id number references b);
create table v3
(v3_id number primary key, a_id number references a, b_id number references b);
Mandating that in any of the V_i tables that exactly one of the ids from A or B is required (but not both) is pretty easy.
alter table V1
add constraint v1_check check
( (a_id is null and b_id is not null)
or (a_id is not null and b_id is null)
);
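If you want to extend that constraint so that exactly one of the ids from A or B is present and that value appears in one and only one row of V1 (note that COALESCE treats an A id and a B id with the same value as the same key, so this assumes the id values of A and B do not overlap):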
create unique index v1_check_unique on v1 ( coalesce (a_id, b_id) );
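To see the "exactly one of A or B" rule in action, here is a small sketch, assuming SQLite via Python's sqlite3 rather than Oracle. It deliberately swaps the single COALESCE index for two separate unique indexes, so that an A id and a B id with the same value do not collide; that substitution, the index names, and the helper try_insert are illustrative assumptions, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE a (a_id INTEGER PRIMARY KEY);
CREATE TABLE b (b_id INTEGER PRIMARY KEY);
CREATE TABLE v1 (
    v1_id INTEGER PRIMARY KEY,
    a_id  INTEGER REFERENCES a(a_id),
    b_id  INTEGER REFERENCES b(b_id),
    -- exactly one of the two references must be set
    CHECK ((a_id IS NULL) <> (b_id IS NULL))
);
-- at most one v1 row per parent; NULLs never collide in a unique index
CREATE UNIQUE INDEX v1_a_unique ON v1 (a_id);
CREATE UNIQUE INDEX v1_b_unique ON v1 (b_id);
INSERT INTO a VALUES (1);
INSERT INTO b VALUES (1);
""")


def try_insert(v1_id, a_id, b_id):
    """Insert one v1 row; return True if accepted, False if a constraint rejected it."""
    try:
        with conn:
            conn.execute("INSERT INTO v1 VALUES (?, ?, ?)", (v1_id, a_id, b_id))
        return True
    except sqlite3.IntegrityError:
        return False


only_a = try_insert(1, 1, None)      # linked to A only
only_b = try_insert(2, None, 1)      # linked to B only (same id value as the A row)
both = try_insert(3, 1, 1)           # violates the CHECK
neither = try_insert(4, None, None)  # violates the CHECK
a_again = try_insert(5, 1, None)     # violates the unique index on a_id
```

Only the first two rows are stored. This still says nothing about V2 and V3 referencing the same parent, which is the harder cross-table part.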
The hard part is making sure that the ids from A and B exist in one and only one of the V_i tables. That can't be done at DML time, but it can be enforced at commit time.
create materialized view log on v1 with rowid;
create materialized view log on v2 with rowid;
create materialized view log on v3 with rowid;
CREATE MATERIALIZED VIEW CROSS_TABLE
REFRESH FAST ON COMMIT
AS
SELECT V1_ID AS V_ID, 'V1' AS TABLE_NAME, ROWID AS ROW_ID,
COALESCE (A_ID, B_ID) AS OTHER_ID FROM V1
UNION ALL
SELECT V2_ID AS V_ID, 'V2' AS TABLE_NAME, ROWID AS ROW_ID,
COALESCE (A_ID, B_ID) AS OTHER_ID FROM V2
UNION ALL
SELECT V3_ID AS V_ID, 'V3' AS TABLE_NAME, ROWID AS ROW_ID,
COALESCE (A_ID, B_ID) AS OTHER_ID FROM V3
/
ALTER TABLE CROSS_TABLE ADD CONSTRAINT CROSS_TABLE_UNIQUE UNIQUE (OTHER_ID);
This appears to work - but not as awesomely as you'd hope. Oracle can't enforce that uniqueness across the tables at statement time because session A isn't allowed to take into account any other changes other sessions might be making. It can only enforce that uniqueness at commit time.
The following test case fails when run against empty tables - and rolls back the entire transaction, as it can't deduce which is causing the failure. Caveat emptor.
INSERT INTO A VALUES (1);
INSERT INTO B VALUES (1);
INSERT INTO V1 (V1_ID, A_ID, B_ID) VALUES (1, 1, NULL);
INSERT INTO V2 (V2_ID, A_ID, B_ID) VALUES (1, 1, NULL);
COMMIT;

Related

How to insert multiple rows into table B, and update table A's null foreign keys with the new IDs?

I've found a million things sounding kind of similar on StackOverflow, but not my case exactly. I'll simplify as much as possible:
I have two tables as follows:
CREATE TABLE B (id uuid PRIMARY KEY);
CREATE TABLE A (id uuid PRIMARY KEY, b_id uuid REFERENCES b);
There are some NULL values in A.b_id. I am trying to create a migration that does the following:
For every row in A with no b_id, create a new row in B, and assign its id to A.b_id.
How can I accomplish this in one query?
Assuming you want a distinct entry in b for every row with a missing UUID in a:
WITH upd AS (
UPDATE a
SET b_id = gen_random_uuid()
WHERE b_id IS NULL
RETURNING b_id
)
INSERT INTO b (id)
SELECT b_id FROM upd;
This works because it's a single command, and the FK reference is only enforced at the end of the command.
See:
SET CONSTRAINTS ALL DEFERRED not working as expected
Constraint defined DEFERRABLE INITIALLY IMMEDIATE is still DEFERRED?
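For comparison, the same backfill can be done from application code inside one transaction. This is only a sketch under assumptions: it uses SQLite through Python's sqlite3 with TEXT columns holding UUID strings, it does not use the data-modifying CTE shown above, and backfill_b_ids is a hypothetical helper name.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE b (id TEXT PRIMARY KEY);
CREATE TABLE a (id TEXT PRIMARY KEY, b_id TEXT REFERENCES b(id));
""")
with conn:
    # three sample rows in a, none of which has a b_id yet
    conn.executemany(
        "INSERT INTO a (id) VALUES (?)",
        [(str(uuid.uuid4()),) for _ in range(3)],
    )


def backfill_b_ids(conn):
    """Give every row of a that lacks a b_id its own new row in b."""
    with conn:  # one transaction: either every row is backfilled or none is
        missing = conn.execute("SELECT id FROM a WHERE b_id IS NULL").fetchall()
        for (a_id,) in missing:
            new_b_id = str(uuid.uuid4())
            # insert the parent first, because this FK is checked immediately
            conn.execute("INSERT INTO b (id) VALUES (?)", (new_b_id,))
            conn.execute("UPDATE a SET b_id = ? WHERE id = ?", (new_b_id, a_id))
    return len(missing)


backfilled = backfill_b_ids(conn)
remaining = conn.execute("SELECT count(*) FROM a WHERE b_id IS NULL").fetchone()[0]
b_rows = conn.execute("SELECT count(*) FROM b").fetchone()[0]
distinct_links = conn.execute("SELECT count(DISTINCT b_id) FROM a").fetchone()[0]
```

The single-statement version is preferable where the database supports it, since it avoids one round trip per row.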

Simple database table design structure

I have a simple database design problem and would appreciate a working suggestion.
My database tables:
TableAees.
TableBees.
1. A row in Aees can be mapped to (contain) one or more rows of Bees, or to none at all.
2. A row in Aees can also be mapped to one or more rows of Aees itself.
A normal primary key / foreign key hierarchy won't serve the purpose here, and I am also worried that a parent/child hierarchy may end up forming a loop between the tables and produce duplicate records in various joins.
I need a better table mapping for the tables above (a, b) that satisfies points 1 and 2.
How should the table relationship/hierarchy be designed to avoid this situation?
Database used: SQL Server
Thanks for sharing your knowledge.
You seem to describe a many-to-many relationship. If so, you would create a third table to store that relationship, like so:
create table a (
a_id int primary key,
...
);
create table b (
b_id int primary key,
...
);
create table ab (
a_id int references a(a_id),
b_id int references b(b_id),
primary key (a_id, b_id)
);
Each a/b tuple is stored on a separate row in bridge table ab.
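A quick way to check how the bridge table behaves is the sketch below. It assumes SQLite via Python's sqlite3 instead of SQL Server, and the sample ids are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE a (a_id INTEGER PRIMARY KEY);
CREATE TABLE b (b_id INTEGER PRIMARY KEY);
CREATE TABLE ab (
    a_id INTEGER NOT NULL REFERENCES a(a_id),
    b_id INTEGER NOT NULL REFERENCES b(b_id),
    PRIMARY KEY (a_id, b_id)  -- each a/b pair can be stored only once
);
INSERT INTO a VALUES (1), (2);
INSERT INTO b VALUES (10), (20);
INSERT INTO ab VALUES (1, 10), (1, 20);  -- a=1 has two b rows, a=2 has none
""")

# Number of b rows linked to each a row; the LEFT JOIN keeps a rows without any b.
b_count_per_a = dict(conn.execute("""
    SELECT a.a_id, count(ab.b_id)
    FROM a LEFT JOIN ab ON ab.a_id = a.a_id
    GROUP BY a.a_id
""").fetchall())

# Storing the same pair twice is rejected by the composite primary key.
try:
    with conn:
        conn.execute("INSERT INTO ab VALUES (1, 10)")
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False
```

Because a pair cannot be stored twice, joins through ab do not multiply rows beyond the relationships that actually exist.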

Database design queries regarding inheritance and foreign key references

I have a question about a design problem I ran into.
There is a table A with subtypes B and C. Table A has an attribute type that tells whether the row is a B or a C. The common attributes of B and C are in A.
The problem is that there are no extra attributes for B: all attributes required for B are already in A. However, there are extra attributes for C.
Is it an acceptable solution to create only tables A and C? To extract entities of type B, I would query table A by the type attribute.
Can you refer me to any material?
I also had another point of confusion, where table A has subtypes B, C, and D. Table Z has a column that requires the primary id of either a B or a C, but NOT a D.
I thought of making Z's column a foreign key reference to the primary id column of A and then adding a trigger to ensure that the id doesn't belong to a D.
Can anyone please comment?
Thank you!
Many people just enforce all these rules in application code. That is, they "simply" don't insert wrong data. Of course this is very fragile and depends on writing perfect application code at all times. So we want the database to enforce the constraints instead, so that wrong data can never be inserted.
CREATE TABLE A (
id INT PRIMARY KEY,
type CHAR(1) NOT NULL,
unique key (id, type)
);
CREATE TABLE B (
id INT PRIMARY KEY,
type CHAR(1) NOT NULL DEFAULT 'B',
FOREIGN KEY (id, type) REFERENCES A(id, type)
);
If you can force B.type to always be 'B' (CHECK constraint, trigger, or reference a one-row lookup table) then it can of course reference parent rows in A where type='B'. And do something similar in tables C and D, so then each row of A can be referenced by a row from only one sub-type table.
That is, if A.type is 'B' on a given row, and C.type can only be 'C', then no row of C can reference any row where A.type is 'B'.
Now if you want table Z to reference B or C but not D, you can reference by id and type, so Z also has its own type column. You can restrict Z.type by using a lookup table:
CREATE TABLE Ztypes (
type CHAR(1) PRIMARY KEY
);
INSERT INTO Ztypes VALUES ('B'), ('C');
CREATE TABLE Z (
id INT PRIMARY KEY,
Aid INT NOT NULL,
type CHAR(1) NOT NULL,
FOREIGN KEY (Aid, type) REFERENCES A(id, type),
FOREIGN KEY (type) REFERENCES Ztypes(type)
);
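The composite (id, type) foreign keys can be exercised with the following sketch. It assumes SQLite via Python's sqlite3, uses sub-type table C in place of B to keep it short, pins C.type with a CHECK constraint, and the sample rows and the helper accepted are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE A (
    id   INTEGER PRIMARY KEY,
    type TEXT NOT NULL,
    UNIQUE (id, type)  -- target for the composite foreign keys below
);
CREATE TABLE C (
    id   INTEGER PRIMARY KEY,
    type TEXT NOT NULL DEFAULT 'C' CHECK (type = 'C'),
    FOREIGN KEY (id, type) REFERENCES A(id, type)
);
CREATE TABLE Ztypes (type TEXT PRIMARY KEY);
INSERT INTO Ztypes VALUES ('B'), ('C');
CREATE TABLE Z (
    id   INTEGER PRIMARY KEY,
    Aid  INTEGER NOT NULL,
    type TEXT NOT NULL,
    FOREIGN KEY (Aid, type) REFERENCES A(id, type),
    FOREIGN KEY (type) REFERENCES Ztypes(type)
);
INSERT INTO A VALUES (1, 'B'), (2, 'C'), (3, 'D');
""")


def accepted(sql):
    """Run one INSERT; return True if accepted, False if a constraint rejected it."""
    try:
        with conn:
            conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


c_for_c_row = accepted("INSERT INTO C (id) VALUES (2)")      # A(2) has type 'C'
c_for_b_row = accepted("INSERT INTO C (id) VALUES (1)")      # A(1) has type 'B'
z_to_b = accepted("INSERT INTO Z VALUES (1, 1, 'B')")        # allowed type, matches A(1)
z_to_d = accepted("INSERT INTO Z VALUES (2, 3, 'D')")        # 'D' is not in Ztypes
z_mislabelled = accepted("INSERT INTO Z VALUES (3, 3, 'C')")  # A(3) is 'D', not 'C'
```

Z can therefore reach rows of A that are B or C, but a D row cannot be referenced, even when the type is mislabelled.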
You've already got the answer you were looking for. But for others who run across this, it's worth researching two techniques: Class Table Inheritance and Shared Primary Key.
These two techniques used together make it fast, simple and easy to join A's data with either B's or C's data. And in this pattern, B contains only the key, but still contains useful information.
Both of these techniques have their own tags.

ORACLE Table design: M:N table best practice

I'd like to hear your suggestions on this very basic question:
Imagine these three tables:
--DROP TABLE a_to_b;
--DROP TABLE a;
--DROP TABLE b;
CREATE TABLE A
(
ID NUMBER NOT NULL ,
NAME VARCHAR2(20) NOT NULL ,
CONSTRAINT A_PK PRIMARY KEY ( ID ) ENABLE
);
CREATE TABLE B
(
ID NUMBER NOT NULL ,
NAME VARCHAR2(20) NOT NULL ,
CONSTRAINT B_PK PRIMARY KEY ( ID ) ENABLE
);
CREATE TABLE A_TO_B
(
id NUMBER NOT NULL,
a_id NUMBER NOT NULL,
b_id NUMBER NOT NULL,
somevalue1 VARCHAR2(20) NOT NULL,
somevalue2 VARCHAR2(20) NOT NULL,
somevalue3 VARCHAR2(20) NOT NULL
) ;
How would you design table a_to_b?
I'll give some discussion starters:
synthetic id-PK column or combined a_id,b_id-PK (dropping the "id" column)
When synthetic: What other indices/constraints?
When combined: Also index on b_id? Or even b_id,a_id (don't think so)?
Also combined when these entries are referenced themselves?
Also combined when these entries perhaps are referenced themselves in the future?
Heap or Index-organized table
Always or only up to x "somevalue"-columns?
I know that the decision for one of the designs is closely related to the question how the table will be used (read/write ratio, density, etc.), but perhaps we get a 20/80 solution as blueprint for future readers.
I'm looking forward to your ideas!
Blama
I have always made the PK be the combination of the two FKs, a_id and b_id in your example. Adding a synthetic id field to this table does no good, since you never end up looking for a row based on a knowledge of its id.
Using the compound PK gives you a constraint that prevents the same instance of the relationship between a and b from being inserted twice. If duplicate entries need to be permitted, there's something wrong with your data model at the conceptual level.
The index you get behind the scenes (for every DBMS I know of) will be useful to speed up common joins. An extra index on b_id is sometimes useful, depending on the kinds of joins you do frequently.
Just as a side note, I don't use the name "id" for all my synthetic pk columns. I prefer a_id, b_id. It makes it easier to manage the metadata, even though it's a little extra typing.
CREATE TABLE A_TO_B
(
a_id NUMBER NOT NULL REFERENCES A (a_id),
b_id NUMBER NOT NULL REFERENCES B (b_id),
PRIMARY KEY (a_id, b_id),
...
) ;
It's not unusual for ORMs to require (or, in more clueful ORMs, hope for) an integer column named "id" in addition to whatever other keys you have. Apart from that, there's no need for it. An id number like that makes the table wider (which usually degrades I/O performance just slightly), and adds an index that is, strictly speaking, unnecessary. It isn't necessary to identify the entity--the existing key does that--and it leads new developers into bad habits. (Specifically, giving every table an integer column named "id", and believing that that column alone is the only key you need.)
You're likely to need one or more of these indexed.
a_id
b_id
{a_id, b_id}
{b_id, a_id}
I believe Oracle should automatically index {a_id, b_id}, because that's the primary key. Oracle doesn't automatically index foreign keys. Oracle's indexing guidelines are online.
In general, you need to think carefully about whether you need ON UPDATE CASCADE or ON DELETE CASCADE. In Oracle, you only need to think carefully about whether you need ON DELETE CASCADE. (Oracle doesn't support ON UPDATE CASCADE.)
The other comments so far are good.
Also consider adding begin_dt and end_dt to the relationship. That way, you can answer a good number of questions about how each relationship changes over time (consider baseline issues).

Design question: Filterable attributes, SQL

I have two tables in my database, Operation and Equipment. An operation requires zero or more attributes. However, there's some logic in how the attributes are attributed:
Operation Foo requires equipment A and B
Operation Bar requires no equipment
Operation Baz requires equipment B and either C or D
Operation Quux requires equipment (A or B) and (C or D)
What's the best way to represent this in SQL?
I'm sure people have done this before, but I have no idea where to start.
(FWIW, my application is built with Python and Django.)
Update 1: There will be around a thousand Operation rows and about thirty Equipment rows. The information is coming in CSV form similar to the description above: Quux, (A & B) | (C & D)
Update 2: The level of conjunctions & disjunctions shouldn't be too deep. The Quux example is probably the most complicated, though there appears to be a A | (D & E & F) case.
Think about how you'd model the operations in OO design: the operations would be subclasses of a common superclass Operation. Each subclass would have mandatory object members for the respective equipment required by that operation.
The way to model this with SQL is Class Table Inheritance. Create a common super-table:
CREATE TABLE Operation (
operation_id SERIAL PRIMARY KEY,
operation_type CHAR(1) NOT NULL,
UNIQUE KEY (operation_id, operation_type),
FOREIGN KEY (operation_type) REFERENCES OperationTypes(operation_type)
);
Then for each operation type, define a sub-table with a column for each required equipment type. For example, OperationFoo has a column for each of equipA and equipB. Since they are both required, the columns are NOT NULL. Constrain them to the correct types by creating a Class Table Inheritance super-table for equipment too.
CREATE TABLE OperationFoo (
operation_id INT PRIMARY KEY,
operation_type CHAR(1) NOT NULL CHECK (operation_type = 'F'),
equipA INT NOT NULL,
equipB INT NOT NULL,
FOREIGN KEY (operation_id, operation_type)
REFERENCES Operation(operation_id, operation_type),
FOREIGN KEY (equipA) REFERENCES EquipmentA(equip_id),
FOREIGN KEY (equipB) REFERENCES EquipmentB(equip_id)
);
Table OperationBar requires no equipment, so it has no equip columns:
CREATE TABLE OperationBar (
operation_id INT PRIMARY KEY,
operation_type CHAR(1) NOT NULL CHECK (operation_type = 'B'),
FOREIGN KEY (operation_id, operation_type)
REFERENCES Operation(operation_id, operation_type)
);
Table OperationBaz has one required equipment, equipB, and then at least one of equipC and equipD must be NOT NULL. Use a CHECK constraint for this:
CREATE TABLE OperationBaz (
operation_id INT PRIMARY KEY,
operation_type CHAR(1) NOT NULL CHECK (operation_type = 'Z'),
equipB INT NOT NULL,
equipC INT,
equipD INT,
FOREIGN KEY (operation_id, operation_type)
REFERENCES Operation(operation_id, operation_type),
FOREIGN KEY (equipB) REFERENCES EquipmentB(equip_id),
FOREIGN KEY (equipC) REFERENCES EquipmentC(equip_id),
FOREIGN KEY (equipD) REFERENCES EquipmentD(equip_id),
CHECK (COALESCE(equipC, equipD) IS NOT NULL)
);
Likewise in table OperationQuux you can use a CHECK constraint to make sure at least one equipment resource of each pair is non-null:
CREATE TABLE OperationQuux (
operation_id INT PRIMARY KEY,
operation_type CHAR(1) NOT NULL CHECK (operation_type = 'Q'),
equipA INT,
equipB INT,
equipC INT,
equipD INT,
FOREIGN KEY (operation_id, operation_type)
REFERENCES Operation(operation_id, operation_type),
FOREIGN KEY (equipA) REFERENCES EquipmentA(equip_id),
FOREIGN KEY (equipB) REFERENCES EquipmentB(equip_id),
FOREIGN KEY (equipC) REFERENCES EquipmentC(equip_id),
FOREIGN KEY (equipD) REFERENCES EquipmentD(equip_id),
CHECK (COALESCE(equipA, equipB) IS NOT NULL AND COALESCE(equipC, equipD) IS NOT NULL)
);
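The effect of the COALESCE-based CHECK can be verified with a cut-down sketch. It assumes SQLite via Python's sqlite3, leaves out the foreign keys to the Operation and Equipment tables for brevity, and uses made-up equipment ids.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE OperationQuux (
    operation_id INTEGER PRIMARY KEY,
    equipA INTEGER,
    equipB INTEGER,
    equipC INTEGER,
    equipD INTEGER,
    -- (A or B) and (C or D): at least one column of each pair must be filled
    CHECK (COALESCE(equipA, equipB) IS NOT NULL
       AND COALESCE(equipC, equipD) IS NOT NULL)
);
""")


def try_insert(row):
    """Insert one OperationQuux row; return True if the CHECK accepted it."""
    try:
        with conn:
            conn.execute("INSERT INTO OperationQuux VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


a_and_d = try_insert((1, 101, None, None, 404))    # one from each pair
b_and_c = try_insert((2, None, 202, 303, None))    # one from each pair
first_pair_only = try_insert((3, 101, 202, None, None))   # nothing from C/D
second_pair_only = try_insert((4, None, None, 303, 404))  # nothing from A/B
```

Rows 1 and 2 are stored; rows 3 and 4 are rejected by the database itself, without any application logic.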
This may seem like a lot of work. But you asked how to do it in SQL. The best way to do it in SQL is to use declarative constraints to model your business rules. Obviously, this requires that you create a new sub-table every time you create a new operation type. This is best when the operations and business rules never (or hardly ever) change. But this may not fit your project requirements. Most people say, "but I need a solution that doesn't require schema alterations."
Most developers probably don't do Class Table Inheritance. More commonly, they just use a one-to-many table structure like other people have mentioned, and implement the business rules solely in application code. That is, your application contains the code to insert only the equipment appropriate for each operation type.
The problem with relying on the app logic is that it can contain bugs and might insert data that doesn't satisfy the business rules. The advantage of Class Table Inheritance is that with well-designed constraints, the RDBMS enforces data integrity consistently. You have assurance that the database literally can't store incorrect data.
But this can also be limiting, for instance if your business rules change and you need to adjust the data. The common solution in this case is to write a script to dump all the data out, change your schema, and then reload the data in the form that is now allowed (Extract, Transform, and Load = ETL).
So you have to decide: do you want to code this in the app layer, or the database schema layer? There are legitimate reasons to use either strategy, but it's going to be complex either way.
Re your comment: You seem to be talking about storing expressions as strings in data fields. I recommend against doing that. The database is for storing data, not code. You can do some limited logic in constraints or triggers, but code belongs in your application.
If you have too many operations to model in separate tables, then model it in application code. Storing expressions in data columns and expecting SQL to use them for evaluating queries would be like designing an application around heavy use of eval().
I think you should have either a one-to-many or many-to-many relationship between Operation and Equipment, depending on whether there is one Equipment entry per piece of equipment, or per equipment type.
I would advise against putting business logic into your database schema, as business logic is subject to change and you'd rather not have to change your schema in response.
Looks like you'll need to be able to group certain equipment together as either conjunction or disjunction and combine these groups together...
OperationEquipmentGroup
    id int
    operation_id int
    is_conjunction bit
OperationEquipment
    id int
    operation_equipment_group_id int
    equipment_id int
You can add ordering columns if that is important and maybe another column to the group table to specify how groups are combined (only makes sense if ordered). But, by your examples, it looks like groups are only conjuncted together.
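As a rough illustration of how such groups could be evaluated, the sketch below stores two of the example operations and checks them against a set of available equipment. It assumes SQLite via Python's sqlite3, snake_case table names, text equipment ids such as 'A' for readability, and a hypothetical helper operation_satisfied; groups of one operation are combined with AND, as suggested above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE operation_equipment_group (
    id             INTEGER PRIMARY KEY,
    operation_id   INTEGER NOT NULL,
    is_conjunction INTEGER NOT NULL  -- 1 = all members needed, 0 = any member suffices
);
CREATE TABLE operation_equipment (
    id                           INTEGER PRIMARY KEY,
    operation_equipment_group_id INTEGER NOT NULL REFERENCES operation_equipment_group(id),
    equipment_id                 TEXT NOT NULL
);
-- operation 1 (Foo): A and B
INSERT INTO operation_equipment_group VALUES (1, 1, 1);
INSERT INTO operation_equipment VALUES (1, 1, 'A'), (2, 1, 'B');
-- operation 2 (Quux): (A or B) and (C or D)
INSERT INTO operation_equipment_group VALUES (2, 2, 0), (3, 2, 0);
INSERT INTO operation_equipment VALUES (3, 2, 'A'), (4, 2, 'B'), (5, 3, 'C'), (6, 3, 'D');
""")


def operation_satisfied(conn, operation_id, available):
    """Return True if every equipment group of the operation is satisfied."""
    groups = conn.execute(
        "SELECT id, is_conjunction FROM operation_equipment_group WHERE operation_id = ?",
        (operation_id,),
    ).fetchall()
    for group_id, is_conjunction in groups:
        members = {
            row[0]
            for row in conn.execute(
                "SELECT equipment_id FROM operation_equipment "
                "WHERE operation_equipment_group_id = ?",
                (group_id,),
            )
        }
        if is_conjunction:
            group_ok = members <= available      # every member is available
        else:
            group_ok = bool(members & available)  # at least one member is available
        if not group_ok:
            return False
    return True  # also the result for an operation with no groups (e.g. Bar)
```

For example, operation_satisfied(conn, 2, {'A', 'D'}) holds, while operation_satisfied(conn, 2, {'A', 'B'}) does not, because nothing from the second group is available.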
Since Operations can have one or more pieces of equipment, you should use a linking table. Your schema would be like this:
Operation
    ID
    othercolumn
Equipment
    ID
    othercolumn
Operation_Equipment_Link
    OperationID
    EquipmentID
The two fields in the third table can be set up as a composite primary key, so you don't need a third field and can more easily keep duplicates out of the table.
In addition to Nicholai's suggestion, I solved a similar problem as follows:
Table Operation has an additional field "OperationType"
Table Equipment has an additional field "EquipmentType"
I have an additional table "DefaultOperationEquipmentType" specifying which EquipmentType needs to be included with each OperationType, e.g.
OperationType   EquipmentType
==============  =============
Foo_Type        A_Type
Foo_Type        B_Type
Baz_Type        B_Type
Baz_Type        C_Type
My application doesn't need complex conditions like (A or B) because in my business logic both alternative equipments belong to the same type of equipment, e.g. in a PC environment I could have an equipment Mouse (A) or Trackball (B), but they both belong to EquipmentType "PointingDevice_Type"
Hope that helps
Be aware: I have not tested this in the wild. That being said, the best* way I can see to do a mapping is with a denormalized table for the grouping.
*(aside from Bill's way, which is hard to set up, but masterful when done correctly)
Operations:
--------------------
Op_ID int not null pk
Op_Name varchar 500
Equipment:
--------------------
Eq_ID int not null pk
Eq_Name varchar 500
Total_Available int
Group:
--------------------
Group_ID int not null pk
-- Here you have a choice. You can either:
-- Not recommended
Equip varchar(500) --Stores a list of EQ_ID's {1, 3, 15}
-- Recommended
Eq_ID_1 bit
Eq_1_Total_Required
Eq_ID_2 bit
Eq_2_Total_Required
Eq_ID_3 bit
Eq_3_Total_Required
-- ... etc.
Operations_to_Group_Mapping:
--------------------
Group_ID int not null frk
Op_ID int not null frk
Thus, in case X: A | (D & E & F)
Operations:
--------------------
Op_ID  Op_Name
1      X
Equipment:
--------------------
Eq_ID  Eq_Name  Total_Available
1      A        5
-- ... snip ...
22     D        15
23     E        0
24     F        2
Group:
--------------------
Group_ID  Eq_ID_1  Eq_1_Total_Required  -- ... etc. ...
1         TRUE     3
-- ... snip ...
2         FALSE    0
Operations_to_Group_Mapping:
--------------------
Group_ID  Op_ID
1         1
2         1
As loath as I am to put recursive (tree) structures in SQL, it sounds like this is really what you're looking for. I would use something modeled like this:
Operation
----------------
OperationID PK
RootEquipmentGroupID FK -> EquipmentGroup.EquipmentGroupID
...
Equipment
----------------
EquipmentID PK
...
EquipmentGroup
----------------
EquipmentGroupID PK
LogicalOperator
EquipmentGroupEquipment
----------------
EquipmentGroupID | (also FK -> EquipmentGroup.EquipmentGroupID)
EntityType       | PK (all 3 columns)
EntityID         | (not FK, but references either Equipment.EquipmentID
                 |  or EquipmentGroup.EquipmentGroupID)
Now that I've put forth an arguably ugly schema, allow me to explain a bit...
Every equipment group can either be an and group or an or group (as designated by the LogicalOperator column). The members of each group are defined in the EquipmentGroupEquipment table, with EntityID referencing either Equipment.EquipmentID or another EquipmentGroup.EquipmentGroupID, the target being determined by the value in EntityType. This will allow you to compose a group that consists of equipment or other groups.
This will allow you to represent something as simple as "requires equipment A", which would look like this:
EquipmentGroupID  LogicalOperator
--------------------------------------------
1                 'AND'

EquipmentGroupID  EntityType  EntityID
--------------------------------------------
1                 1           'A'
...all the way to your "A | (D & E & F)", which would look like this:
EquipmentGroupID  LogicalOperator
--------------------------------------------
1                 'OR'
2                 'AND'

EquipmentGroupID  EntityType  EntityID
--------------------------------------------
1                 1           'A'
1                 2           2    -- group ID 2
2                 1           'D'
2                 1           'E'
2                 1           'F'
(I realize that I've mixed data types in the EntityID column; this is just to make it clearer. Obviously you wouldn't do this in an actual implementation)
This would also allow you to represent structures of arbitrary complexity. While I realize that you (correctly) don't wish to overarchitect the solution, I don't think you can really get away with less without breaking 1NF (by combining multiple equipment into a single column).
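To make the recursive structure concrete, here is a minimal sketch that stores "A | (D & E & F)" and evaluates it against a set of available equipment. It assumes SQLite via Python's sqlite3, integer equipment ids (A=1, D=4, E=5, F=6 are made up), EntityType 1 for equipment and 2 for a nested group, and a hypothetical helper group_satisfied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EquipmentGroup (
    EquipmentGroupID INTEGER PRIMARY KEY,
    LogicalOperator  TEXT NOT NULL CHECK (LogicalOperator IN ('AND', 'OR'))
);
CREATE TABLE EquipmentGroupEquipment (
    EquipmentGroupID INTEGER NOT NULL REFERENCES EquipmentGroup(EquipmentGroupID),
    EntityType       INTEGER NOT NULL CHECK (EntityType IN (1, 2)),  -- 1 = equipment, 2 = group
    EntityID         INTEGER NOT NULL,
    PRIMARY KEY (EquipmentGroupID, EntityType, EntityID)
);
-- A | (D & E & F), with assumed equipment ids A=1, D=4, E=5, F=6
INSERT INTO EquipmentGroup VALUES (1, 'OR'), (2, 'AND');
INSERT INTO EquipmentGroupEquipment VALUES
    (1, 1, 1),  -- group 1 contains equipment A
    (1, 2, 2),  -- group 1 contains group 2
    (2, 1, 4), (2, 1, 5), (2, 1, 6);  -- group 2 contains D, E, F
""")


def group_satisfied(conn, group_id, available):
    """Recursively evaluate a group against the set of available equipment ids.

    No cycle detection is done; the data is assumed to form a tree.
    """
    (operator,) = conn.execute(
        "SELECT LogicalOperator FROM EquipmentGroup WHERE EquipmentGroupID = ?",
        (group_id,),
    ).fetchone()
    members = conn.execute(
        "SELECT EntityType, EntityID FROM EquipmentGroupEquipment "
        "WHERE EquipmentGroupID = ?",
        (group_id,),
    ).fetchall()
    results = (
        entity_id in available
        if entity_type == 1
        else group_satisfied(conn, entity_id, available)
        for entity_type, entity_id in members
    )
    return all(results) if operator == "AND" else any(results)
```

With this data, group_satisfied(conn, 1, {1}) and group_satisfied(conn, 1, {4, 5, 6}) both hold, while group_satisfied(conn, 1, {4, 5}) does not. A real implementation would also want a guard against groups that contain themselves.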
From what I understood, you want to store the equipment in relation to the operations in a way that will allow you to apply your business logic to it later. In that case you'll need 3 tables:
Operations:
    ID
    name
Equipment:
    ID
    name
Operations_Equipment:
    equipment_id
    operation_id
    symbol
Here symbol is A, B, C, etc.
If you have a condition like (A & B) | (C & D), you can easily tell which equipment is which.