Constraint to table column name (postgresql) - sql

I'm implementing a product database using the single table inheritance (potentially later class table inheritance) model for product attributes. That's all working fine and well but I'm trying to figure out how best to deal with product variants whilst maintaining referential integrity.
Right now a simplified version of my main product table looks like this:
CREATE TABLE product (
id SERIAL NOT NULL,
name VARCHAR(100) NOT NULL,
brand VARCHAR(40) NOT NULL,
color VARCHAR(40)[] NOT NULL
)
(color is an array so that all of the standard colors of any given product can be listed)
For handling variants I've considered tracking the properties on which products vary in a table called product_variant_theme:
CREATE TABLE product_variant_theme (
id SERIAL NOT NULL,
product_id INT NOT NULL,
attribute_name VARCHAR(40) NOT NULL
)
Wherein I insert rows with the product_id in question and add the column name for the attribute into the attribute_name field e.g. 'color'.
Now feel free to tell me if this is an entirely stupid way to go about this in the first place, but I am concerned by the lack of a constraint between attribute_name and the actual column name itself. Obviously if alter the product table and remove that column I might still be left with rows in my second table that refer to it. The functional equivalent of what I'm looking for would be something like a foreign key on attribute_name to the information_schema view that describes the tables, but I don't think there's any way to do that directly, and I'm wondering if there is any reasonable way to get that kind of functionality here.
Thanks.

Are you looking for something like this?
product
=======
id
name
attribute
=========
id
name
product_attribute_map
=====================
product_id
attribute_id
value

Related

PosgreSQL - Ensuring that at least one row exists in Table B for each row in Table A

Currently we have a products table which is fairly straightforward, the relevant part of the structure is something like this:
id SERIAL PRIMARY KEY,
title text NOT NULL,
description text NOT NULL,
[...]
We now need to support an arbitrary number of languages for the title and description of each product, and the default language can vary from product to product (some sites may be multilingual from page to page).
So far, nothing too difficult - add a product_metadata table something like this:
product_id int NOT NULL REFERENCES products(id),
language_code_id int NOT NULL REFERENCES language_codes(id),
title text NOT NULL,
description text NOT NULL,
[...]
CONSTRAINT product_metadata_pkey PRIMARY KEY (product_id, language_code_id)
It seems like the next logical step is to move the existing title and description data into the new table and remove those columns from products, but this means that new rows in products can be added without a title or description.
Using id SERIAL PRIMARY KEY in product_metadata (and replacing the existing composite primary key with a unique constraint) and adding a default_metadata_id int NOT NULL REFERENCES product_metadata(id) column to products would ensure at least one metadata row per product, but it creates a loop between the tables.
It looks like using a deferrable constraint would accommodate this as long as the insert queries were written to insert into both tables before committing, but creating a deliberate cycle and relying on this kind of behaviour seems... messy. Is there a neater way to achieve the same thing, or is this one of those cases where that really is the right way to go?

How to design entity tables for entities with multiple names

I want to create a table structure to store customers and I am facing a challenge: for each customer I can have multiple names, one being the primary one and the others being the alternative names.
The initial take on the tables looks like this:
CREATE TABLE dbo.Customer (
CustomerId INT IDENTITY(1,1) NOT NULL --PK
-- other fields below )
CREATE TABLE dbo.CustomerName (
CustomerNameId INT IDENTITY(1,1) NOT NULL -- PK
,CustomerId INT -- FK to Customer
,CustomerName VARCHAR(30)
,IsPrimaryName BIT)
Though, the name of the customer is part of the Customer entity and I feel that it belongs to the Customer table.
Is there a better design for this situation?
Thank you
Personally, I would keep the Primary name in the Customer table and create an "AlternateNames" table with a zero-to-many relationship to Customer.
This is because presumably most of the time when you are returning customer data, you are only going to be interested in returning the Primary Name. And probably the main (if not only) reason you want the alternate names is for looking up customers when an alternate name has been supplied.
Unfortunately, this is too long for a comment.
Before figuring this out, more information is needed.
Is additional information needed for names? For instance, language or title or date created?
Are the names unique? Is the uniqueness within a customer or over all names?
Are the primary names unique?
Does every customer have to have a primary name?
How often does the primary name change to an alternate name? (As opposed to just having the name updated.)
When querying the data, will you know if the name is a primary or alternate name? (Or do they all need to be compared?)
Depending on the answer to this question, the appropriate data structure can have some tricky nuances. For instance, if you have a flag to identify the primary name, it can be tricky to ensure that exactly one row has this value set -- particularly when updating rows.
Note: If you update the question with the answers, I'll delete this.

How do I realize this 3 simple things in SQL?

i am an absolute SQL beginner and i already did search a lot with google but didnt find what i needed. So how do i realize in SQL(translated from a ER-Model):
An Entity having an Atribute that can have mulitple Entrys(I already found the ARRAY contstraint but i am unsure about that)
An Entity having an Atribute that consists itself of a few more Atributes(Picture: http://static3.creately.com/blog/wp-content/uploads/2012/03/Attributes-ER-Diagrams.jpeg)
Something like a isA Relation. And especially the total/partial and the disjunct characteristics.
Thanks already
In Postgres you have all three of this:
create table one
(
id integer primary key,
tags text[] -- can store multiple tags in a single column
);
A single column with multiple attributes can be done through a record type:
create type address as (number integer, street varchar(100), city varchar(100));
create table customer
(
id integer primary key,
name varchar(100) not null,
billing_address address
);
An isA relation can be done using inheritance
create table paying_customer
(
paid_at timestamp not null,
paid_amount decimal(14,2) not null
)
inherits (customer);
A paying customer has all attributes of a customer plus the time when the invoice was paid.

Multi valued column in SQL

I created a table for Boat that contains the following columns: BName, Type, Price, OName.
However, the Type column should be one of the following: Sailboat, Houseboat, or Deckboat.
How can I reflect this on the create table statement. I've searched about it and I came up with this statement which I'm not sure if it's right or not:
CREATE TABLE Boat
(
BName varchar(255),
BType int,
Price double,
OName varchar(255),
PRIMARY KEY (BName),
FOREIGN KEY (BType) REFERENCES BoType(ID)
);
CREATE TABLE BoType
(
ID int PRIMARY KEY,
Type varchar(255)
)
Is this the best way to do it?
You can try something like this:
mycol VARCHAR(10) NOT NULL CHECK (mycol IN('moe', 'curley', 'larry'))
Here are more details on MSSQL "Check Constraints":
http://technet.microsoft.com/en-us/library/ms188258%28v=sql.105%29.aspx
That's the best way to do it just make sure that you populate the BoType table with the desired reference values (i.e. Sailboat, Houseboat, Deckboat). Because if you use constraint then you have the user of the database who have no knowledge in SQL or have no access rights in your DB, at your mercy or they become too dependent on you. However, if you set it as a separate table then the user of your system/database even without knowledge about SQL could add or change values via your front-end program (e.g. ASP, PHP). In other words you design is more flexible and scalable not to mention less maintenance in your part.

SQL One-to-One Relationship Definition

I'm designing a database and I'm not sure how to define one of the relationships. Here's the situation:
An invoice is created
If the product is not in stock then it needs to be manufactured and so a work order is created.
The relationship is one-to-one. However work orders are sometimes created for other purposes so the WorkOrder table will also be linked to other tables in a similar one-to-one relationship. Also, some Invoices won't have a work order at all. This means I can't define these relationships in the normal way by using the same primary key in both tables. Instead of doing this I've created a linking table and then set unique indexes on both fields to define the one-to-one relationship (see image).
(source: markevans.org)
.
Is this the best way?
Cheers
Mark
EDIT: I just realised that this design will allow a single work order to be linked to an invoice and also to one of the other tables I mentioned via 2 linking tables. I guess no solution is perfect.
Okay, this answer is SQL Server specific, but should be adaptable to other RDBMSs, with a little work. So far as I see, we have the following constraints:
An invoice may be associated with 0 or 1 Work Orders
A Work Order must be associated with an invoice or an ABC or a DEF
I'd design the WorkOrder table as follows:
CREATE TABLE WorkOrder (
WorkOrderID int IDENTITY(1,1) not null,
/* Other Columns */
InvoiceID int null,
ABCID int null,
DEFID int null,
/* Etc for other possible links */
constraint PK_WorkOrder PRIMARY KEY (WorkOrderID),
constraint FK_WorkOrder_Invoices FOREIGN KEY (InvoiceID) references Invoice (InvoiceID),
constraint FK_WorkOrder_ABC FOREIGN KEY (ABCID) references ABC (ABCID),
/* Etc for other FKs */
constraint CK_WorkOrders_SingleFK CHECK (
CASE WHEN InvoiceID is null THEN 0 ELSE 1 END +
CASE WHEN ABCID is null THEN 0 ELSE 1 END +
CASE WHEN DEFID is null THEN 0 ELSE 1 END
/* + other FK columns */
= 1
)
)
So, basically, this table is constrained to only FK to one other table, no matter how many PKs are defined. If necessary, a computed column could tell you the "Type" of item that this is linked to, based on which FK column is non-null, or the type and a single int column could be real columns, and InvoiceID, ABCID, etc could be computed columns.
The final thing to ensure is that an invoice only has 0 or 1 Work Orders. If your RDMBS ignores nulls in unique constraints, this is as simple as applying such a constraint to each FK column. For SQL Server, you need to use a filtered index (>=2008) or an indexed view (<=2005). I'll just show the filtered index:
CREATE UNIQUE INDEX IX_WorkItems_UniqueInvoices on
WorkItem (InvoiceID) where (InvoiceID is not null)
Another way to deal with keeping WorkOrders straight is to include a WorkOrder type column in WorkOrder (e.g. 'Invoice','ABC','DEF'), including a computed or column constrained by check constraint to contain the matching value in the link table, and introduce a second foreign key:
CREATE TABLE WorkOrder (
WorkOrderID int IDENTITY(1,1) not null,
Type varchar(10) not null,
constraint PK_WorkOrder PRIMARY KEY (WorkOrderID),
constraint UQ_WorkOrder_TypeCheck UNIQUE (WorkOrderID,Type),
constraint CK_WorkOrder_Types CHECK (Type in ('INVOICE','ABC','DEF'))
)
CREATE TABLE Invoice_WorkOrder (
InvoiceID int not null,
WorkOrderID int not null,
Type varchar(10) not null default 'INVOICE',
constraint PK_Invoice_WorkOrder PRIMARY KEY (InvoiceID),
constraint UQ_Invoice_WorkOrder_OrderIDs UNIQUE (WorkOrderID),
constraint FK_Invoice_WorkOrder_Invoice FOREIGN KEY (InvoiceID) references Invoice (InvoiceID),
constraint FK_Invoice_WorkOrder_WorkOrder FOREIGN KEY (WorkOrderID) references WorkOrder (WorkOrderID),
constraint FK_Invoice_WorkOrder_TypeCheck FOREIGN KEY (WorkOrderID,Type) references WorkOrder (WorkOrderID,Type),
constraint CK_Invoice_WorkOrder_Type CHECK (Type = 'INVOICE')
)
The only disadvantage to this model, although closer to your original proposal, is that you can have a work order that isn't actually linked to any other item (although it claims to be for an e.g INVOICE).
What you have looks to be a perfectly normal way to construct your tables.
If you think you might like to use only one link table between your WorkOrder table and whatever other tables that may have WorkOrders, you could use a link table like:
WorkOrders
OtherId (Could be InvoiceId, or an ID for SomethingElse that may have a WorkOrder)
OtherType (ENUM - something like 'Invoice', 'SomethingElse')
WorkOrderId
So the issue is that you can have invoices that don't have work orders and work orders that don't have invoices but the two need to be linked when there is a link. I would say based upon that description that your database diagram is pretty good. This would open you up to allowing more than a one-to-one relationship. This way down the road you can consider having two work orders for one invoice. You might also have one work order that handles two invoices. This opens you up to a lot of possibilities that you may not need now but that you might in the future.
I would recommend your current design. In the future, you may want to add more information about the link between invoice and work order. This middle table will allow you to add this information.
In the interest of fairness to the other side of the coin, you do need to consider speed/number of tables/etc. that this will cause. For example, you have now created a third table which increased your table count by 50% in this example. Look at the rest of your database. If you did this everywhere, you would probably have the most normalized database but it might not be the most performant because of all the joins that are necessary. Basically, this isn't a "one-size-fits-all" solution. Instead it is a design choice. Personally, I hate nullable foreign key fields. I find they don't give me the granularity I usually want with my database designs.
Your schema corresponds to a many-to-many link between the 2 tables. You are de facto opening here the possibility to have one work order for multiple invoices, and multiple work orders for one invoice. The model offers then possibilities far above the rules you are setting.
You could use a simpler schema, that will reflect the (0,1) relation between work orders and invoices, and the (0,1) relation between Invoices and Work orders:
a Work Order can be independant from
an invoice, or linked to one specific
invoice: it has a (0,1) relation to Invoice table
An invoice can have no work orders, or one work orders: it has a (0,1) relation to Work Orders Table
Such a relation can be translated by the following model and rules
Invoice
id_Invoice, Primary Key
WorkOrder
id_WorkOrder, Primary Key
id_Invoice, Foreign Key, Nulls accepted, unique value
With such a structure, it will be easy to add new 'dependants' to work orders table. If, for example, you want to open the possibility to launch work orders from restocking orders (where you want to have minimal quantities of some items in stock), you can then just add the corresponding field to the WorkOrder table:
id_RestockingOrder, ForeignKey, Nulls accepted, unique value
You'll be then able to 'see' from where your WorkOrder comes: an invoice, a restocking order, etc.
Seems it corresponds to your needs.
Edit:
as noted by #mark, SQL Server will not allow multiple null values, in contradiction with ANSI specs (check here for some more details), As we do not want to wait for SQL Server 2011 to have this rule implemented, there is a workaround here, where you can build a view excluding the null values and set a unique index on this view. I must admit that I did not like this solution ...
There is still the possibility to implement the 'unique if not null' rule in your code. It will still be simpler than implementing the many-to-many model (with the Invoice_WorkOrder table) you are proposing and manage all additional unicity rules that you'll need to implement.
There is no real need for the link table, just have them linked directly and allow for NULL in the reference field of the work order. Because a work order can be linked to multiple tables what I would do is add a reference id on every work order to every table that can link from it. So you would have:
Invoice
PK - ID
FK - WorkOrderID
SomeOtherTable
PK - ID
FK - WorkOrderID
WorkOrder
PK - ID
FK - InvoiceID (allow NULL)
FK - SomeOtherTableID (allow NULL)
To make sure a WorkOrder is linked to only one item, you have to use code to validate the row (or perhaps a stored procedure which I cannot come up with right now).
EDIT: PS, if you want to use a link table, give it a generic name and add all the linked tables with the same sort of construct I just described allowing for NULL's. In my eyes adding the extra table makes the schema larger than it needs to be, but if a work order contains a lot of big text fields it could increase performance slightly and reduce database size with all the indexes flying around. In anything but the largest applications, I would consider it over-normalization though, but that is a matter of style.