Adding a unique constraint for a relationship 2 degrees away

Consider this database with 4 tables (primary-keys in asterisks):
Products( *ProductId*, SkuText, ... )
ProductRevisions( ProductId, *RevisionId*, ... )
Orders( *OrderId*, ... )
OrderItems( *OrderId*, *ProductRevisionId*, Quantity, ... )
The idea is that a Product SKU can have multiple revisions (e.g. a 2016 version of a product compared to its 2015 version). The business rules are such that an Order for a Product can only have a single ProductRevision: an order cannot request both the 2014 and 2016 versions of the same product; it can only have one or the other.
Ordinarily this wouldn't be a problem: the OrderItems table would have a ProductId column with a UNIQUE constraint on OrderId and ProductId. However, because OrderItems references ProductRevisionId (so the reference to the ProductId is indirect), a simple UNIQUE constraint is not enough, and the schema would accept the following data even though it is invalid per the business rules:
Products
ProductId, SkuText
1, 'Kingston USB Stick'
ProductRevisions
ProductId, RevisionId, ...
1, 1, '2014 model'
1, 2, '2016 model'
Orders
OrderId
1
OrderItems
OrderId, ProductRevisionId, Quantity
1, 1, 100
1, 2, 50 -- Invalid data! Two revisions of the same Product should not be in the same order.
What I need is something like this:
ALTER TABLE OrderItems
ADD CONSTRAINT UNIQUE ( OrderId, SELECT ProductId FROM ProductRevisions WHERE RevisionId = OrderItems.ProductRevisionId )
I don't want to denormalize my OrderItems table by adding an explicit ProductId column, because that adds a potential point of failure: if the parent/child relationship between a given ProductId and ProductRevisionId were to change, the data would become invalid.
What are my options?

This is really more of a comment, but it is too long.
One option is to create a trigger. This allows you to validate the data using any rules that you want. However, triggers are cumbersome and unnecessary.
Another option is essentially what you say: include both Product and ProductRevision in OrderItems. However, this doesn't quite solve the problem. You need to ensure that the product actually matches the product on the revision.
I am thinking that the best option might be to give ProductRevisions its own surrogate key alongside a per-product revision number. So, this table would have:
ProductRevisionId -- primary key for the table
ProductId
RevisionId
unique constraint on (ProductId, RevisionId)
The foreign key constraint in OrderItems can then have two columns in it, (ProductId, RevisionId), referencing that unique constraint. Then a unique constraint on (OrderId, ProductId) ensures only one revision of a product per order.
The downside to this method is that a product can appear on only one line in each order. However, you don't need triggers.
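The composite foreign key plus unique constraint described above can be tried out quickly. This is only a sketch under stated assumptions: it uses SQLite through Python's sqlite3 module (not SQL Server) so that it runs anywhere, the column types are guesses, and the try_insert helper is made up for the demonstration. The sample rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to
conn.executescript("""
CREATE TABLE Products (ProductId INTEGER PRIMARY KEY, SkuText TEXT);
CREATE TABLE ProductRevisions (
    ProductRevisionId INTEGER PRIMARY KEY,
    ProductId INTEGER NOT NULL REFERENCES Products (ProductId),
    RevisionId INTEGER NOT NULL,
    UNIQUE (ProductId, RevisionId)          -- target of the composite FK
);
CREATE TABLE Orders (OrderId INTEGER PRIMARY KEY);
CREATE TABLE OrderItems (
    OrderId INTEGER NOT NULL REFERENCES Orders (OrderId),
    ProductId INTEGER NOT NULL,
    RevisionId INTEGER NOT NULL,
    Quantity INTEGER NOT NULL,
    FOREIGN KEY (ProductId, RevisionId)
        REFERENCES ProductRevisions (ProductId, RevisionId),
    UNIQUE (OrderId, ProductId)             -- one revision per product per order
);
INSERT INTO Products VALUES (1, 'Kingston USB Stick');
INSERT INTO ProductRevisions VALUES (1, 1, 1), (2, 1, 2);
INSERT INTO Orders VALUES (1);
INSERT INTO OrderItems VALUES (1, 1, 1, 100);
""")

def try_insert(sql):
    """Hypothetical helper: run one statement, report whether a constraint blocked it."""
    try:
        with conn:
            conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

# Second revision of product 1 in order 1: blocked by UNIQUE (OrderId, ProductId).
second_revision = try_insert("INSERT INTO OrderItems VALUES (1, 1, 2, 50)")
# Product/revision pair that does not exist: blocked by the composite FK.
bogus_pair = try_insert("INSERT INTO OrderItems VALUES (1, 2, 1, 10)")
print(second_revision)
print(bogus_pair)
```

Because ProductId in OrderItems is covered by the foreign key, it cannot drift away from the revision's real product, which addresses the denormalization worry raised in the question.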

You can create an indexed view to enforce the constraint. In your case, it'd be something like:
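-- Schema-bound view exposing the product behind each order item
create view [OrderItemProductRevisions]
with schemabinding
as
select oi.OrderId, pr.ProductId
from dbo.OrderItems as oi
join dbo.ProductRevisions as pr
on oi.ProductRevisionId = pr.RevisionId -- RevisionId is the key of ProductRevisions in the question's schema
go
-- The unique clustered index materializes the view and rejects duplicate (OrderId, ProductId) pairs
create unique clustered index [CUIX_OrderItemProductRevisions]
on [OrderItemProductRevisions] (OrderId, ProductId)
go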
Now, if you try to add two revisions of the same product to the same order, you should violate the unique index on the view and it will be disallowed.

Related

MSSQL - is there a way to automatically add an entry to one table when an entry is added to another?

I'm a bit of an SQL novice, so please bear with me on this one. My project is as follows:
Using MSSQL on Windows Server 2008 R2.
There is an existing database table - let's call it PRODUCTS - which contains several thousand rows of data, which is the product information for every product Company X sells. The three columns I am interested in are ITEMGROUPID, ITEMID and ITEMNAME. The ITEMID is the primary key for this table, and is a unique product code. ITEMGROUPID indicates what category of product each item falls into, and ITEMNAME is self explanatory. I am only interested in one category of product, so by using ITEMGROUPID I can determine how many rows my table will have (currently 260).
I am now creating a table containing some parameters for making each of these products - let's call it LINEPARAMETERS. For example, when we make Widget A, we need Conveyor B to run at Speed C. I intend to create a foreign key in my table, pointing to the ITEMID in the other table. So each row in my new table will reference a specific product in the existing product database.
My question is, if a new product is developed that matches my criteria (ITEMGROUPID = 'VALUE'), and entered into the existing table with an ITEMID, is there any way for my table to automatically generate a new row with that ITEMID and default values in all other columns?

You could create a trigger that fires on insert to product and inserts a row into the lineparameters, like this:
create trigger line_parameter_inserter
on products
after insert
as
-- "inserted" is a table holding every newly inserted row, so select from it rather than using VALUES
insert into lineparameters (productId, col1, col2)
select i.id, 'foo', 'bar'
from inserted as i;
but a better option is to create a foreign key from the product table to your group defaults table. That way a row must exist in the defaults table before you insert into the product table, like this:
create table lineparameters (
id int,
col1 int,
...,
primary key (id)
)
create table products (
id int,
lineparametersId int not null,
...,
primary key (id),
foreign key (lineparametersId) references lineparameters(id)
)
This creates a solid process and ensures that even if someone (silently) disables or deletes the trigger, you won't have data integrity problems.
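For the trigger route mentioned at the start of this answer, here is a small runnable sketch that also applies the question's ITEMGROUPID = 'VALUE' filter. It is an illustration, not SQL Server code: it uses SQLite via Python's sqlite3, where triggers fire per row and refer to NEW instead of the set-based inserted table. The ConveyorSpeed column, its default, and the sample item codes are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PRODUCTS (
    ITEMID TEXT PRIMARY KEY,
    ITEMGROUPID TEXT NOT NULL,
    ITEMNAME TEXT NOT NULL
);
CREATE TABLE LINEPARAMETERS (
    ITEMID TEXT PRIMARY KEY REFERENCES PRODUCTS (ITEMID),
    ConveyorSpeed INTEGER NOT NULL DEFAULT 0   -- hypothetical parameter column
);
-- Fires once per inserted product, but only for the group of interest.
CREATE TRIGGER line_parameter_inserter
AFTER INSERT ON PRODUCTS
WHEN NEW.ITEMGROUPID = 'VALUE'
BEGIN
    INSERT INTO LINEPARAMETERS (ITEMID) VALUES (NEW.ITEMID);
END;
""")

with conn:
    conn.execute("INSERT INTO PRODUCTS VALUES ('W-A', 'VALUE', 'Widget A')")
    conn.execute("INSERT INTO PRODUCTS VALUES ('X-1', 'OTHER', 'Unrelated item')")

# Only the product in the matching group received a default parameter row.
rows = conn.execute("SELECT ITEMID, ConveyorSpeed FROM LINEPARAMETERS").fetchall()
print(rows)
```

In SQL Server the equivalent filter would go in a WHERE clause on the select from inserted, since one trigger invocation can cover many rows.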

Normalizing Data and Insert Into SQL

I have what may be a stupid question but here it goes.
I have an ORDER_T table, a CUSTOMER_T table, and an ORDERLINE_T table.
I also have a set of data I need to normalize. Each record in this "bad data" has up to 3 items stored in it in attributes called Item1, Item2, and Item3. I thought I was normalizing it correctly by separating out each item and giving it its own record. For example:
ORDER_T
OrderID ItemID ItemDescription CustomerID
1 1001 Apple 100
1 1002 Grape 100
1 1003 Pear 100
OrderID is the PK and CustomerID is the FK. I realized, though, as I tried to INSERT INTO my DB, that it complained of duplicate records on the PK. Duh, that makes sense. Now my question is:
I believe I am wrong, but what would be the correct way to normalize data (to third normal form) where each OrderID consists of multiple items? Is having attributes such as Item1, Item2, Item3, etc. "bad form" because it is not scalable and is statically set like that? Am I overthinking it, and should I have simply left it alone?
I just believe I need some direction and I'll be good to go.

You need the following tables.
All unique customers:
Customers:
CustomerId (PK)
Name
All unique items:
Items:
ItemId (PK)
ItemName
All unique orders:
Orders:
OrderId (PK)
CustomerID (FK)
OrderDate
And then you need a many-to-many relationship table:
OrderItems:
OrderId (FK)
ItemId (FK)
count
primary key (OrderId, ItemId)
Then you will be able to insert an order (which can be empty), and add or remove items from this order via the OrderItems table.
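The four tables above can be sketched and loaded with the question's Apple/Grape/Pear order to see that one OrderId now maps to several items without a primary key clash. This is a minimal illustration in SQLite via Python's sqlite3; the column types, the customer name, the order date, and the name ItemCount (standing in for the count column) are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to
conn.executescript("""
CREATE TABLE Customers (CustomerId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Items (ItemId INTEGER PRIMARY KEY, ItemName TEXT);
CREATE TABLE Orders (
    OrderId INTEGER PRIMARY KEY,
    CustomerId INTEGER NOT NULL REFERENCES Customers (CustomerId),
    OrderDate TEXT
);
CREATE TABLE OrderItems (
    OrderId INTEGER NOT NULL REFERENCES Orders (OrderId),
    ItemId INTEGER NOT NULL REFERENCES Items (ItemId),
    ItemCount INTEGER NOT NULL,
    PRIMARY KEY (OrderId, ItemId)   -- an item appears at most once per order
);
INSERT INTO Customers VALUES (100, 'Example customer');
INSERT INTO Items VALUES (1001, 'Apple'), (1002, 'Grape'), (1003, 'Pear');
INSERT INTO Orders VALUES (1, 100, '2020-01-01');
INSERT INTO OrderItems VALUES (1, 1001, 1), (1, 1002, 1), (1, 1003, 1);
""")

# Rebuild the flat view from the question by joining the normalized tables.
order_lines = conn.execute("""
    SELECT o.OrderId, oi.ItemId, i.ItemName, o.CustomerId
    FROM Orders AS o
    JOIN OrderItems AS oi ON oi.OrderId = o.OrderId
    JOIN Items AS i ON i.ItemId = oi.ItemId
    ORDER BY oi.ItemId
""").fetchall()
for line in order_lines:
    print(line)
```

The item description lives only in Items, and the customer only in Orders, so neither is repeated per order line.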

Primary key VS Foreign key

I'm a newbie here.
I created one table with a primary key customer_id, and another table with a foreign key customer_id to join it to the first table.
My question:
When I want to enter data in the two tables, should I insert the customer_id twice (once in the first table and once in the second)?
Should I do that every time I insert data?
Thanks :)

Your customer_id primary key identifies each customer in the Customer table. So whenever a new customer arrives, you create an id for that customer.
For other tables that "relate" to the customer, you insert a customer_id for each entry.
E.g.
Customer
CustomerId, CustomerName
Each customer has a unique id..
ProductSold
ProductId, ProductName, CustomerId
You can now tell which customer bought a product because of the foreign key in the ProductSold table.
So for each product, you insert the customer's id that bought it. I hope that makes sense.
-- A new customer, requires a new id (when you insert a new customer)
-- A product bought by customer, requires a foreign CustomerId to identify its buyer.
So 2 CustomerId inserts.
So yes.. you are right lol :P
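To make the "two inserts" point concrete, here is a small sketch using SQLite via Python's sqlite3 (an assumption; the question does not name a database). The customer id is created once in Customer and then reused in each ProductSold row. The names 'Alice', 'Widget', and 'Gadget' are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to
conn.executescript("""
CREATE TABLE Customer (
    CustomerId INTEGER PRIMARY KEY,   -- generated automatically on insert
    CustomerName TEXT NOT NULL
);
CREATE TABLE ProductSold (
    ProductId INTEGER PRIMARY KEY,
    ProductName TEXT NOT NULL,
    CustomerId INTEGER NOT NULL REFERENCES Customer (CustomerId)
);
""")

with conn:
    # The id is created once, when the customer row is inserted...
    cur = conn.execute("INSERT INTO Customer (CustomerName) VALUES ('Alice')")
    customer_id = cur.lastrowid
    # ...and then repeated as a foreign key in every related row.
    for product_name in ("Widget", "Gadget"):
        conn.execute(
            "INSERT INTO ProductSold (ProductName, CustomerId) VALUES (?, ?)",
            (product_name, customer_id),
        )

# A row pointing at a customer that does not exist is refused by the foreign key.
try:
    with conn:
        conn.execute("INSERT INTO ProductSold (ProductName, CustomerId) VALUES ('Orphan', 999)")
    orphan_result = "accepted"
except sqlite3.IntegrityError:
    orphan_result = "rejected"

sold_count = conn.execute(
    "SELECT COUNT(*) FROM ProductSold WHERE CustomerId = ?", (customer_id,)
).fetchone()[0]
print(customer_id, sold_count, orphan_result)
```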

SQL One-to-One Relationship Definition

I'm designing a database and I'm not sure how to define one of the relationships. Here's the situation:
An invoice is created
If the product is not in stock then it needs to be manufactured and so a work order is created.
The relationship is one-to-one. However work orders are sometimes created for other purposes so the WorkOrder table will also be linked to other tables in a similar one-to-one relationship. Also, some Invoices won't have a work order at all. This means I can't define these relationships in the normal way by using the same primary key in both tables. Instead of doing this I've created a linking table and then set unique indexes on both fields to define the one-to-one relationship (see image).
(Image of the linking-table design; source: markevans.org)
Is this the best way?
Cheers
Mark
EDIT: I just realised that this design will allow a single work order to be linked to an invoice and also to one of the other tables I mentioned via 2 linking tables. I guess no solution is perfect.

Okay, this answer is SQL Server specific, but should be adaptable to other RDBMSs, with a little work. So far as I see, we have the following constraints:
An invoice may be associated with 0 or 1 Work Orders
A Work Order must be associated with an invoice or an ABC or a DEF
I'd design the WorkOrder table as follows:
CREATE TABLE WorkOrder (
WorkOrderID int IDENTITY(1,1) not null,
/* Other Columns */
InvoiceID int null,
ABCID int null,
DEFID int null,
/* Etc for other possible links */
constraint PK_WorkOrder PRIMARY KEY (WorkOrderID),
constraint FK_WorkOrder_Invoices FOREIGN KEY (InvoiceID) references Invoice (InvoiceID),
constraint FK_WorkOrder_ABC FOREIGN KEY (ABCID) references ABC (ABCID),
/* Etc for other FKs */
constraint CK_WorkOrders_SingleFK CHECK (
CASE WHEN InvoiceID is null THEN 0 ELSE 1 END +
CASE WHEN ABCID is null THEN 0 ELSE 1 END +
CASE WHEN DEFID is null THEN 0 ELSE 1 END
/* + other FK columns */
= 1
)
)
So, basically, this table is constrained to only FK to one other table, no matter how many FKs are defined. If necessary, a computed column could tell you the "Type" of item that this is linked to, based on which FK column is non-null, or the type and a single int column could be real columns, and InvoiceID, ABCID, etc. could be computed columns.
The final thing to ensure is that an invoice only has 0 or 1 Work Orders. If your RDBMS ignores nulls in unique constraints, this is as simple as applying such a constraint to each FK column. For SQL Server, you need to use a filtered index (>=2008) or an indexed view (<=2005). I'll just show the filtered index:
CREATE UNIQUE INDEX IX_WorkOrder_UniqueInvoices on
WorkOrder (InvoiceID) where (InvoiceID is not null)
Another way to keep WorkOrders straight is to include a WorkOrder type column in WorkOrder (e.g. 'Invoice', 'ABC', 'DEF'), add to each link table a computed column, or a column constrained by a check constraint, that holds the matching type value, and introduce a second foreign key:
CREATE TABLE WorkOrder (
WorkOrderID int IDENTITY(1,1) not null,
Type varchar(10) not null,
constraint PK_WorkOrder PRIMARY KEY (WorkOrderID),
constraint UQ_WorkOrder_TypeCheck UNIQUE (WorkOrderID,Type),
constraint CK_WorkOrder_Types CHECK (Type in ('INVOICE','ABC','DEF'))
)
CREATE TABLE Invoice_WorkOrder (
InvoiceID int not null,
WorkOrderID int not null,
Type varchar(10) not null default 'INVOICE',
constraint PK_Invoice_WorkOrder PRIMARY KEY (InvoiceID),
constraint UQ_Invoice_WorkOrder_OrderIDs UNIQUE (WorkOrderID),
constraint FK_Invoice_WorkOrder_Invoice FOREIGN KEY (InvoiceID) references Invoice (InvoiceID),
constraint FK_Invoice_WorkOrder_WorkOrder FOREIGN KEY (WorkOrderID) references WorkOrder (WorkOrderID),
constraint FK_Invoice_WorkOrder_TypeCheck FOREIGN KEY (WorkOrderID,Type) references WorkOrder (WorkOrderID,Type),
constraint CK_Invoice_WorkOrder_Type CHECK (Type = 'INVOICE')
)
The only disadvantage to this model, although it is closer to your original proposal, is that you can have a work order that isn't actually linked to any other item (although it claims to be for, e.g., an INVOICE).
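The first design in this answer (nullable FK columns, a CHECK that exactly one is set, and a unique index that skips nulls) can be exercised end to end. The sketch below is an approximation under assumptions: it runs on SQLite via Python's sqlite3, whose partial indexes play the role of SQL Server's filtered indexes, it keeps only the Invoice and ABC links, and try_insert is a made-up helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to
conn.executescript("""
CREATE TABLE Invoice (InvoiceID INTEGER PRIMARY KEY);
CREATE TABLE ABC (ABCID INTEGER PRIMARY KEY);
CREATE TABLE WorkOrder (
    WorkOrderID INTEGER PRIMARY KEY,
    InvoiceID INTEGER NULL REFERENCES Invoice (InvoiceID),
    ABCID INTEGER NULL REFERENCES ABC (ABCID),
    -- Exactly one of the link columns must be non-null.
    CONSTRAINT CK_WorkOrder_SingleFK CHECK (
        CASE WHEN InvoiceID IS NULL THEN 0 ELSE 1 END +
        CASE WHEN ABCID IS NULL THEN 0 ELSE 1 END = 1
    )
);
-- At most one work order per invoice; rows without an invoice are not indexed.
CREATE UNIQUE INDEX IX_WorkOrder_UniqueInvoices
    ON WorkOrder (InvoiceID) WHERE InvoiceID IS NOT NULL;
INSERT INTO Invoice VALUES (1), (2);
INSERT INTO ABC VALUES (1), (2);
""")

def try_insert(invoice_id, abc_id):
    """Hypothetical helper: insert one work order, report whether a constraint blocked it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO WorkOrder (InvoiceID, ABCID) VALUES (?, ?)",
                (invoice_id, abc_id),
            )
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

results = {
    "invoice 1": try_insert(1, None),
    "abc 1": try_insert(None, 1),
    "abc 2": try_insert(None, 2),            # several null InvoiceIDs are fine
    "invoice 1 again": try_insert(1, None),  # violates the unique index
    "no link": try_insert(None, None),       # violates the CHECK
    "two links": try_insert(2, 1),           # violates the CHECK
}
for label, outcome in results.items():
    print(label, "->", outcome)
```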

What you have looks to be a perfectly normal way to construct your tables.
If you think you might like to use only one link table between your WorkOrder table and whatever other tables that may have WorkOrders, you could use a link table like:
WorkOrders
OtherId (Could be InvoiceId, or an ID for SomethingElse that may have a WorkOrder)
OtherType (ENUM - something like 'Invoice', 'SomethingElse')
WorkOrderId

So the issue is that you can have invoices that don't have work orders and work orders that don't have invoices but the two need to be linked when there is a link. I would say based upon that description that your database diagram is pretty good. This would open you up to allowing more than a one-to-one relationship. This way down the road you can consider having two work orders for one invoice. You might also have one work order that handles two invoices. This opens you up to a lot of possibilities that you may not need now but that you might in the future.
I would recommend your current design. In the future, you may want to add more information about the link between invoice and work order. This middle table will allow you to add this information.
In the interest of fairness to the other side of the coin, you do need to consider speed/number of tables/etc. that this will cause. For example, you have now created a third table which increased your table count by 50% in this example. Look at the rest of your database. If you did this everywhere, you would probably have the most normalized database but it might not be the most performant because of all the joins that are necessary. Basically, this isn't a "one-size-fits-all" solution. Instead it is a design choice. Personally, I hate nullable foreign key fields. I find they don't give me the granularity I usually want with my database designs.

Your schema corresponds to a many-to-many link between the 2 tables. You are de facto opening here the possibility to have one work order for multiple invoices, and multiple work orders for one invoice. The model offers then possibilities far above the rules you are setting.
You could use a simpler schema, that will reflect the (0,1) relation between work orders and invoices, and the (0,1) relation between Invoices and Work orders:
A work order can be independent from an invoice, or linked to one specific invoice: it has a (0,1) relation to the Invoice table
An invoice can have no work orders, or one work order: it has a (0,1) relation to the Work Orders table
Such a relation can be translated by the following model and rules
Invoice
id_Invoice, Primary Key
WorkOrder
id_WorkOrder, Primary Key
id_Invoice, Foreign Key, Nulls accepted, unique value
With such a structure, it will be easy to add new 'dependants' to work orders table. If, for example, you want to open the possibility to launch work orders from restocking orders (where you want to have minimal quantities of some items in stock), you can then just add the corresponding field to the WorkOrder table:
id_RestockingOrder, ForeignKey, Nulls accepted, unique value
You'll be then able to 'see' from where your WorkOrder comes: an invoice, a restocking order, etc.
Seems it corresponds to your needs.
Edit:
As noted by @mark, SQL Server will not allow multiple null values in a unique column, in contradiction with the ANSI spec (check here for some more details). As we do not want to wait for SQL Server 2011 to have this rule implemented, there is a workaround here, where you can build a view excluding the null values and set a unique index on this view. I must admit that I did not like this solution ...
There is still the possibility to implement the 'unique if not null' rule in your code. It will still be simpler than implementing the many-to-many model (with the Invoice_WorkOrder table) you are proposing and managing all the additional uniqueness rules that you'll need to implement.

There is no real need for the link table, just have them linked directly and allow for NULL in the reference field of the work order. Because a work order can be linked to multiple tables what I would do is add a reference id on every work order to every table that can link from it. So you would have:
Invoice
PK - ID
FK - WorkOrderID
SomeOtherTable
PK - ID
FK - WorkOrderID
WorkOrder
PK - ID
FK - InvoiceID (allow NULL)
FK - SomeOtherTableID (allow NULL)
To make sure a WorkOrder is linked to only one item, you have to use code to validate the row (or perhaps a stored procedure which I cannot come up with right now).
EDIT: PS, if you want to use a link table, give it a generic name and add all the linked tables with the same sort of construct I just described allowing for NULL's. In my eyes adding the extra table makes the schema larger than it needs to be, but if a work order contains a lot of big text fields it could increase performance slightly and reduce database size with all the indexes flying around. In anything but the largest applications, I would consider it over-normalization though, but that is a matter of style.

How do I check constraints between two tables when inserting into a third table that references the other two tables?

Consider this example schema:
Customer ( int CustomerId pk, .... )
Employee ( int EmployeeId pk,
int CustomerId references Customer.CustomerId, .... )
WorkItem ( int WorkItemId pk,
int CustomerId references Customer.CustomerId,
null int EmployeeId references Employee.EmployeeId, .... )
Basically, three tables:
A customer table with a primary key and some additional columns
An employee table with a primary key and a foreign key constraint referencing the customer table's primary key, representing an employee of the customer.
A work item table, which stores work done for the customer, and also info about the specific employee who the work was performed for.
My question is: how do I, at the database level, test whether an employee is actually associated with a customer when adding new work items?
If for example Scott (employee) works at Microsoft (customer), and Jeff (employee) works at StackOverflow (customer), how do I prevent somebody from adding a work item into the database with customer = Microsoft and employee = Jeff, which does not make sense?
Can I do it with check constraints or foreign keys or do I need a trigger to test for it manually?
Should mention that I use SQL Server 2008.
UPDATE: I should add that WorkItem.EmployeeId can be null.
Thanks, Egil.

Wouldn't a foreign key on a composite column (CustomerId, EmployeeId) work?
ALTER TABLE WorkItem
ADD CONSTRAINT FK_Customer_Employee FOREIGN KEY (CustomerId, EmployeeId)
REFERENCES Employee (CustomerId, EmployeeId);
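A sketch of how this composite foreign key behaves, including the nullable EmployeeId from the question's update. Two assumptions are worth flagging: the referenced pair needs its own unique constraint on Employee (SQL Server also requires the referenced columns to be covered by a primary key or unique constraint), and the demo runs on SQLite via Python's sqlite3, not SQL Server 2008. The try_insert helper is invented for the example; the Scott and Jeff rows come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to
conn.executescript("""
CREATE TABLE Customer (CustomerId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Employee (
    EmployeeId INTEGER PRIMARY KEY,
    CustomerId INTEGER NOT NULL REFERENCES Customer (CustomerId),
    Name TEXT,
    UNIQUE (CustomerId, EmployeeId)   -- gives the composite FK something to reference
);
CREATE TABLE WorkItem (
    WorkItemId INTEGER PRIMARY KEY,
    CustomerId INTEGER NOT NULL REFERENCES Customer (CustomerId),
    EmployeeId INTEGER NULL,
    CONSTRAINT FK_Customer_Employee FOREIGN KEY (CustomerId, EmployeeId)
        REFERENCES Employee (CustomerId, EmployeeId)
);
INSERT INTO Customer VALUES (1, 'Microsoft'), (2, 'StackOverflow');
INSERT INTO Employee VALUES (1, 1, 'Scott'), (2, 2, 'Jeff');
""")

def try_insert(customer_id, employee_id):
    """Hypothetical helper: insert one work item, report whether the FK blocked it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO WorkItem (CustomerId, EmployeeId) VALUES (?, ?)",
                (customer_id, employee_id),
            )
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

scott_at_microsoft = try_insert(1, 1)    # matching pair
jeff_at_microsoft = try_insert(1, 2)     # Jeff works for customer 2, not 1
no_employee = try_insert(1, None)        # a null EmployeeId skips the composite check
print(scott_at_microsoft, jeff_at_microsoft, no_employee)
```

When EmployeeId is null the composite key is not checked at all, so the separate single-column foreign key on CustomerId is still what keeps the customer valid.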

You might be able to do this by creating a view "WITH SCHEMABINDING" that spans those tables and enforces the collective constraints of the individual tables.

Why do you want employeeId to be null in WorkItem? Maybe you should add another table to avoid that particular oddity. From what I can see the easiest thing to do is to add a unique constraint on employeeId in WorkItem, and maybe even a unique constraint on customerId if that is what you want.
A more general way to add constraints spanning many tables is to define a view that should always be empty, and add the constraint that it is empty.

What are you trying to model here?
(1) You're a contracting agency or the like, and you have a bunch of contractors who are (for some period of time) assigned to a customer.
(2) You're actually storing information about other companies' employees (maybe you're providing outsourced payroll services, for example).
In case (1), it looks like you have a problem with the Employee table. In particular, when Scott's contract with MS is up and he gets contracted to someone else, you can't keep the historical data, because you need to change the CustomerId. Which also invalidates all the WorkItems. Instead, you should have a fourth table, e.g., CustomerEmployee to store that. Then WorkItem should reference that table.
In case (2), your primary key on Employee should really be CustomerId, EmployeeId. Two customers could have the same employee ID number. Then Kieron's foreign key will work.

I recently ran into a similar situation. Consider the schema:
Table company (id_cia PK)
Table product_group (id_cia FK to company, id_group PK)
Table products (id_group FK to product_group, id_product PK, id_used_by_the_client null)
Rule: The database must allow only one id_used_by_the_client for each product of a company, but this field can be null. Example:
Insert into company (1) = allowed
Insert into company (2) = allowed
Insert into product_group (1, 1) = allowed
Insert into product_group (1,2) = allowed
Insert into product_group (2,3) = allowed
Insert into products values (1, 1, null) = allowed
Insert into products values (1, 2, null) = allowed
Insert into products values (1, 3, 1) = allowed
Insert into products values (1, 4, 1) = not allowed, in the group 1 that belongs to company 1 already exists an id_used_by_the_client = 1.
Insert into products values (2, 4, 1) = not allowed, in the group 2 that belongs to company 1 already exists an id_used_by_the_client = 1.
Insert into products values (3, 4, 1) = allowed, in the group 3 that belongs to company 2 there is no id_used_by_the_client = 1.
I decided to use a trigger to control this integrity.
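The answer does not show its trigger, so the following is only one possible shape for it, written for SQLite via Python's sqlite3 (an assumption; the original database is not named). The trigger name, the error message, and the try_insert helper are made up. It covers inserts only; an UPDATE trigger along the same lines would also be needed for full coverage. The rows replay the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to
conn.executescript("""
CREATE TABLE company (id_cia INTEGER PRIMARY KEY);
CREATE TABLE product_group (
    id_cia INTEGER NOT NULL REFERENCES company (id_cia),
    id_group INTEGER PRIMARY KEY
);
CREATE TABLE products (
    id_group INTEGER NOT NULL REFERENCES product_group (id_group),
    id_product INTEGER PRIMARY KEY,
    id_used_by_the_client INTEGER NULL
);
-- Reject a non-null client id that already exists anywhere in the same company.
CREATE TRIGGER products_unique_client_id
BEFORE INSERT ON products
WHEN NEW.id_used_by_the_client IS NOT NULL
BEGIN
    SELECT RAISE(ABORT, 'id_used_by_the_client already used in this company')
    WHERE EXISTS (
        SELECT 1
        FROM products AS p
        JOIN product_group AS g ON g.id_group = p.id_group
        WHERE p.id_used_by_the_client = NEW.id_used_by_the_client
          AND g.id_cia = (SELECT id_cia FROM product_group
                          WHERE id_group = NEW.id_group)
    );
END;
INSERT INTO company VALUES (1), (2);
INSERT INTO product_group VALUES (1, 1), (1, 2), (2, 3);
""")

def try_insert(id_group, id_product, client_id):
    """Hypothetical helper: insert one product, report whether the trigger blocked it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO products VALUES (?, ?, ?)",
                (id_group, id_product, client_id),
            )
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

outcomes = [
    try_insert(1, 1, None),
    try_insert(1, 2, None),
    try_insert(1, 3, 1),
    try_insert(1, 4, 1),   # same group, same company
    try_insert(2, 4, 1),   # other group, same company
    try_insert(3, 4, 1),   # group of a different company
]
print(outcomes)
```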

Either:
make the EmployeeID column the Primary Key of Employee (and possibly an auto-id) and store the EmployeeID in the WorkItem record as a foreign key, instead of storing the Employee and Customer IDs in WorkItem. You can retrieve a WorkItem's Customer details by joining to the Customer table via the Employee table.
Or:
make the WorkItem's EmployeeID and CustomerID columns a composite foreign key to Employee.
I favour the first approach, personally.
