How to Set Customer Table with Multiple Phone Numbers? - Relational Database Design - sql

CREATE TABLE Phone
(
phoneID - PK
.
.
.
);
CREATE TABLE PhoneDetail
(
phoneDetailID - PK
phoneID - FK points to Phone
phoneTypeID ...
phoneNumber ...
.
.
.
);
CREATE TABLE Customer
(
customerID - PK
firstName
phoneID - Unique FK points to Phone
.
.
.
);
A customer can have multiple phone numbers e.g. Cell, Work, etc.
phoneID in Customer table is unique and points to PhoneID in Phone table.
If customer record is deleted, phoneID in Phone table should also be deleted.
Do you have any concerns about my design? Is it designed properly? My problem is
that phoneID in the Customer table is the child side of the relationship, so when a Customer record is deleted I
cannot delete the parent (Phone) record automatically.

I think you've overdesigned it. I see no use for a separate Phone + PhoneDetail table. Typically there are two practical approaches.
1) Simplicity - Put all of the phones in the Customer record itself. Yes, it breaks normalization rules, but it's very simple in practice and usually works as long as you provide a fixed set of columns (Work, Home, Mobile, Fax, Emergency). The upside is that the code is simple to write and time to implementation is shorter. Retrieving all the phones with a customer record is simple, and so is using a specific type of phone (Customer.Fax).
The downsides: adding phone types later is a little more painful, and searching for phone numbers is kludgy. You have to write SQL like "select * from customer where cell = ? or home = ? or work = ? or emergency = ?". Assess your design up front. If either of these issues is a concern, or you don't know whether it may become one, go with the normalized approach.
2) Extensibility - Go the route you are going. Phone types can be added later, no DDL changes. Customer -> CustomerPhone
Customer (
customerId
)
CustomerPhone (
customerId references Customer(customerId)
phoneType references PhoneTypes(phoneTypeId)
phoneNumber
)
PhoneTypes (
phoneTypeId (H, W, M, F, etc.)
phoneTypeDescription
)
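A minimal sketch of the normalized layout above, using Python's built-in sqlite3 module so it can be run as-is. The column types, the sample rows and the one-number-per-type primary key are my assumptions, not part of the design above; adapt them to your database.

```python
import sqlite3

SCHEMA = """
CREATE TABLE Customer (
    customerId INTEGER PRIMARY KEY,
    firstName  TEXT NOT NULL
);
CREATE TABLE PhoneTypes (
    phoneTypeId          TEXT PRIMARY KEY,   -- 'H', 'W', 'M', 'F', ...
    phoneTypeDescription TEXT NOT NULL
);
CREATE TABLE CustomerPhone (
    customerId  INTEGER NOT NULL REFERENCES Customer(customerId),
    phoneType   TEXT    NOT NULL REFERENCES PhoneTypes(phoneTypeId),
    phoneNumber TEXT    NOT NULL,
    PRIMARY KEY (customerId, phoneType)      -- assumption: one number per type
);
"""

def build_db():
    """Create the three tables in memory and load a little sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO PhoneTypes VALUES (?, ?)",
                     [("H", "Home"), ("W", "Work"), ("M", "Mobile")])
    conn.execute("INSERT INTO Customer VALUES (1, 'Ann')")
    conn.executemany("INSERT INTO CustomerPhone VALUES (?, ?, ?)",
                     [(1, "M", "555-0100"), (1, "W", "555-0101")])
    conn.commit()
    return conn

def find_customers_by_phone(conn, number):
    """One predicate covers every phone type, unlike the column-per-type design."""
    rows = conn.execute(
        "SELECT DISTINCT c.customerId, c.firstName "
        "FROM Customer c JOIN CustomerPhone p ON p.customerId = c.customerId "
        "WHERE p.phoneNumber = ?", (number,))
    return rows.fetchall()
```

Adding a fax type later is just an INSERT into PhoneTypes; no table has to be altered.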

As mrjoltcola already addressed the normalization, I'll tackle the problem of having a record in phone and no record in phone detail.
If that is your only problem there are three approaches:
1) Do not delete from the detail table; delete from Phone with CASCADE DELETE. This removes rows from both tables with a single SQL statement and keeps the data consistent.
2) Put triggers on the detail table that delete the parent automatically when the last child record for that parent is deleted. (This will not perform well, will slow down all deletes on the table, and is ugly, but it is possible.)
3) Do it in the business logic layer of the application. If this layer is properly separated and users (applications) modify data only through it, you may reach the desired level of consistency guarantee.
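To illustrate option 1, here is a small sketch using Python's sqlite3 module. The table and column names follow the question; the types and sample rows are assumptions. Deleting the Phone row removes its PhoneDetail rows in the same statement.

```python
import sqlite3

def demo_cascade():
    """Delete a parent Phone row and report how many detail rows survive."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript("""
        CREATE TABLE Phone (phoneID INTEGER PRIMARY KEY);
        CREATE TABLE PhoneDetail (
            phoneDetailID INTEGER PRIMARY KEY,
            phoneID       INTEGER NOT NULL
                          REFERENCES Phone(phoneID) ON DELETE CASCADE,
            phoneNumber   TEXT NOT NULL
        );
        INSERT INTO Phone VALUES (1);
        INSERT INTO PhoneDetail VALUES (10, 1, '555-0100'), (11, 1, '555-0101');
    """)
    conn.execute("DELETE FROM Phone WHERE phoneID = 1")  # cascades to PhoneDetail
    remaining = conn.execute("SELECT COUNT(*) FROM PhoneDetail").fetchone()[0]
    conn.close()
    return remaining
```

The cascade only runs from parent to child, which is why the delete has to start at Phone rather than at the detail table.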

Related

How to invert an incorrectly modeled 1:1 relationship?

I have an incorrectly modeled 1:1 relationship between two tables:
Table: Customer
* id (bigint)
* ...
Table: Address
* id
* customer_id (bigint) <--- FOREIGN KEY
* street (varchar)
* ...
The real-world relationship is such that a customer may have one address or none. However, with the current data model, it would be possible to assign multiple addresses to a customer. We do not do this at the moment, so the data could be migrated to this:
Table: Customer
* id (bigint)
* address_id (nullable bigint)
* ...
Is it possible to make this migration in one transaction, using purely SQL code? I would like to avoid an intermediate state where we have both relationships and migrate the customers one by one, which is the best idea I have come up with so far.
As I understand it, you want to add an address_id column to the Customer table. If a customer has more than one address, you need to pick one; here I take the last one (the highest id).
update Customer
set address_id=(select max(id) from Address a where a.customer_id=Customer.id)
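A sketch of running the column addition and the update above as one transaction, using Python's sqlite3 module with manual transaction control. Whether DDL such as ALTER TABLE can be rolled back depends on the database (SQLite and PostgreSQL allow it, MySQL commits DDL implicitly), so treat this as an illustration under that assumption. The sample rows are made up.

```python
import sqlite3

def migrate(conn):
    """Add Customer.address_id and backfill it inside one explicit transaction."""
    conn.execute("BEGIN")
    try:
        conn.execute("ALTER TABLE Customer ADD COLUMN address_id INTEGER")
        conn.execute(
            "UPDATE Customer SET address_id = "
            "(SELECT MAX(a.id) FROM Address a WHERE a.customer_id = Customer.id)")
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")  # leaves the schema and data untouched
        raise

def demo():
    # isolation_level=None puts the connection in autocommit mode, so the
    # BEGIN / COMMIT statements in migrate() are the only transaction control.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.executescript("""
        CREATE TABLE Customer (id INTEGER PRIMARY KEY);
        CREATE TABLE Address (id INTEGER PRIMARY KEY, customer_id INTEGER, street TEXT);
        INSERT INTO Customer VALUES (1), (2);
        INSERT INTO Address VALUES (7, 1, 'Main St 1');
    """)
    migrate(conn)
    return conn.execute("SELECT id, address_id FROM Customer ORDER BY id").fetchall()
```

Customers without an address end up with a NULL address_id, which matches the nullable column in the target model.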
You can actually leave your data model as is and just add a unique constraint to address:
alter table address add constraint unq_address_customer unique (customer_id);
This is not ideal, but it does enforce the unique constraint with minimal changes to the data model.
That said, I would question why you want only one address per customer. Have you considered these situations?
Customers whose "delivery" address and whose "billing" address are different.
Customers who move.
Customers whose address changes although they do not move, say due to postal code reassignments or street name changes.
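For reference, a runnable sketch of the unique-constraint approach using Python's sqlite3 module. SQLite has no ALTER TABLE ... ADD CONSTRAINT, so the sketch uses CREATE UNIQUE INDEX, which has the same effect here. The column types and sample rows are assumptions.

```python
import sqlite3

def second_address_rejected():
    """Return True if a second address for the same customer is refused."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Customer (id INTEGER PRIMARY KEY);
        CREATE TABLE Address (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES Customer(id),
            street      TEXT
        );
        CREATE UNIQUE INDEX unq_address_customer ON Address (customer_id);
        INSERT INTO Customer VALUES (1);
        INSERT INTO Address VALUES (1, 1, 'Main St 1');
    """)
    try:
        conn.execute("INSERT INTO Address VALUES (2, 1, 'Side St 2')")
    except sqlite3.IntegrityError:
        return True   # the unique index blocked the duplicate customer_id
    return False
```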

Database design for customer to skills

I have a customer table that includes name, email and address, and a skills table (e.g. qts, first aid) whose rows are identified by an id. For example:
Customer
id = 1
Name = James
Address = some address
Skills
1, qts
2, first aid
I am now trying to model the relationship. My first quick solution was a skills table with a customerId and a true / false column for each skill. I then created a go-between table, customer_skills, mapping customerId to SkillId. But I don't know how to update the records when values change, as there is no unique id.
Can anyone advise on the best way to do this?
thanks....
The solution you want really depends on your data, and this question has been asked thousands of times before. If you google "Entity Attribute Value vs strict relational model" you will see countless articles comparing and contrasting the methods available.
Strict Relational Model
You would add additional BIT or DATETIME fields (where a NULL datetime represents the customer not having the skill) to your customer table for each skill. This works well if you have few skills that are unlikely to change much over time.
This allows simple queries to locate customers with skills, especially with various combinations of skills, e.g. (using datetime fields):
SELECT *
FROM Customer
WHERE Skill1 >= '20120101' -- SKILL 1 ACQUIRED AFTER 1ST JAN 2012
AND Skill2 IS NOT NULL -- HAS SKILL 2
AND Skill3 IS NULL -- DOES NOT POSSESS SKILL 3
Entity-Attribute-Value Model
This is a slight adaptation of a classic entity-attribute-value model, because the value is boolean represented by the existence of a record.
You would create a table like this:
CREATE TABLE CustomerSkills
( CustomerID INT NOT NULL,
SkillID INT NOT NULL,
PRIMARY KEY (CustomerID, SkillID),
FOREIGN KEY (CustomerID) REFERENCES Customer (ID),
FOREIGN KEY (SkillID) REFERENCES Skills (ID)
)
You may want additional columns such as DateAdded, AddedBy etc to track when skills were added and who by etc, but the core principles can be gathered from the above.
With this method it is much easier to add skills, as it doesn't require adding columns, but can make simple queries much more complicated. The above query would have to be written as:
SELECT Customer.*
FROM Customer
INNER JOIN
( SELECT CustomerID
FROM CustomerSkills
WHERE SkillID IN (2, 3) -- SKILL2,SKILL3
OR (SkillID = 1 AND DateAdded >= '20120101')
GROUP BY CustomerID
HAVING COUNT(*) = 2
AND COUNT(CASE WHEN SkillID = 3 THEN 1 END) = 0
) skills
ON skills.CustomerID = Customer.ID
This is much more complex and resource intensive than with the relational model, but the overall structure is much more flexible.
So to summarise, it really depends on your own particular situation, there are a few factors to consider, but there are plenty of resources out there to help you decide.
If you have a table linking the primary keys from two other tables together in order to form a many-to-many relationship (like in your example), you don't have to update that table. Instead you can just delete and reinsert values into it.
If you are editing a customer (customerId 46 for instance) and changing the skills for that customer, you can just delete all skills for the customer and then reinsert the new set of skills when storing the changes.
If your "link table" contains some additional information besides just the two primary key columns, then the situation might be different. But from your description it seems like you just want to link the table together using the primary keys from each table. In that case a delete + reinsert should be fine.
Also in this kind of table, you should make the combination of the two foreign key fields be the primary key of the binding table.
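A minimal sketch of the delete + reinsert approach, using Python's sqlite3 module. The helper name replace_skills and the sample data are assumptions; the table layout follows the CustomerSkills link table with a composite primary key.

```python
import sqlite3

def replace_skills(conn, customer_id, skill_ids):
    """Replace a customer's skill set atomically: delete all, then reinsert."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("DELETE FROM CustomerSkills WHERE CustomerID = ?",
                     (customer_id,))
        conn.executemany(
            "INSERT INTO CustomerSkills (CustomerID, SkillID) VALUES (?, ?)",
            [(customer_id, s) for s in skill_ids])

def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Customer (ID INTEGER PRIMARY KEY, Name TEXT);
        CREATE TABLE Skills (ID INTEGER PRIMARY KEY, Name TEXT);
        CREATE TABLE CustomerSkills (
            CustomerID INTEGER NOT NULL REFERENCES Customer(ID),
            SkillID    INTEGER NOT NULL REFERENCES Skills(ID),
            PRIMARY KEY (CustomerID, SkillID)
        );
        INSERT INTO Customer VALUES (46, 'James');
        INSERT INTO Skills VALUES (1, 'qts'), (2, 'first aid');
        INSERT INTO CustomerSkills VALUES (46, 1);
    """)
    replace_skills(conn, 46, [2])  # customer 46 now has only 'first aid'
    return conn.execute(
        "SELECT SkillID FROM CustomerSkills WHERE CustomerID = 46 "
        "ORDER BY SkillID").fetchall()
```

Doing both steps in one transaction means other readers never see the customer with an empty skill set.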

Create foreign key with non unique column in SQL Server

I am dealing with a table that contains both cars and owners (table CO). I am creating another table to contain attributes for an owner (table OwnerAttributes), that a user can assign to through a GUI. My problem lies in the fact that owners are not unique and since I am using SQL Server I cannot create a foreign key on them. There is an id in the table, but it identifies the car and owner as a whole.
The idea I had to get around this problem is to create a new table (table Owners) that contains distinct owners, and then adding a trigger to table CO that would update the Owners with any changes. I can then use table Owners for my OwnerAttributes table and solve my problem.
The question I want answered is if there is a better way to do this?
I am using a preexisting database that is heavily used by an old application. The application is hooked up to use the table CO for owners and cars. There also exist several other tables that use the CO table. I wish I could split the table into Owners and Cars, but the company doesn't want me to spend all my time doing it as there are several more features I need to add to the application.
Your thoughts on the Owners table are on the right track! Your problem arises because your schema is not normalized: you're storing two things (cars and owners) in one table (your table CO).
You are correct that you should make an Owner table, but you should then remove the Owner information from the CO table entirely, and replace it with a foreign key to the Owners table.
So you want something like this:
CREATE TABLE Owner (
ownerID int not null primary key identity(1,1),
FirstName varchar(255),
LastName varchar(255)
/* other fields here */
)
GO
CREATE TABLE Car (
carID int not null primary key identity(1,1),
ownerID int not null references Owner(ownerID)
/* other fields go here */
)
GO
/* a convenience, read only view to replace your old CAR OWNER table */
CREATE VIEW Car_Owner AS
SELECT c.*, o.FirstName, o.LastName FROM Car c INNER JOIN Owner o ON c.ownerID = o.ownerID
Now, you have everything properly normalized in SQL. A view has given you back the car_owner as one thing in a pseudo-table.
But the real answer is, normalize your schema. Let SQL do what it does best (relate things to other things). Combining the two things on one table will just lead to more problems like you're encountering downstream.
Hopefully this answer seems helpful and not condescending, which is what I was going for! I have learned the hard way that this approach (normalize everything, let the database do some extra work to retrieve/display/insert it) is the only one that works out in the end.
You should create an Owner table, a Car table, and an OwnerCar table (if a person can have several cars). The Owner table contains the fields that describe the owner (owner properties).

Auto increment with a Unit Of Work

Context
I'm building a persistence layer to abstract different types of databases that I'll be needing. On the relational part I have MySQL, Oracle and PostgreSQL.
Let's take the following simplified MySQL tables:
CREATE TABLE Contact (
ID varchar(15),
NAME varchar(30)
);
CREATE TABLE Address (
ID varchar(15),
CONTACT_ID varchar(15),
NAME varchar(50)
);
I use code to generate system-specific alphanumeric unique IDs fitting 15 chars in this case. Thus, if I insert a Contact record with its Addresses, I have my generated Contact.ID and Address.CONTACT_IDs before committing.
I've created a Unit of Work (amongst others) as per Martin Fowler's patterns to add transaction support. I'm using a key based Identity Map in the UoW to track the changed records in memory. It works like a charm for the scenario above, all pretty standard stuff so far.
The question scenario comes in when I have a database that is not under my control and the ID fields are auto-increment (or in Oracle sequences). In this case I do not have the db generated Contact.ID beforehand, so when I create my Address I do not have a value for Address.CONTACT_ID. The transaction has not been started on the DB session since all is kept in the Identity Map in memory.
Question: What is a good approach to address this? (Avoiding unnecessary db round trips)
Some ideas:
Retrieve the last ID: I can do a call to the database to retrieve the last Id like:
SELECT Auto_increment FROM information_schema.tables WHERE table_name='Contact';
But this is MySQL specific, and probably something similar can be done for the other databases. If I do this, I would need to do the first insert, get the ID and then update the children (Address.CONTACT_IDs), all in the current transaction context.
Avoid explicitly referencing the CONTACT_ID entirely. Assuming that Contact.NAME has a UNIQUE constraint and that the CONTACT_ID column REFERENCES Contact(ID):
INSERT INTO Contact (NAME) VALUES ('Joe Bloggs'); -- Contact.ID auto-generated
INSERT INTO Address (CONTACT_ID, NAME)
VALUES ((SELECT ID FROM Contact WHERE NAME = 'Joe Bloggs'),
'123 Apple Lane');
Now Address.CONTACT_ID is correct without your code knowing the key's value or even its type.
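If a natural key like NAME is not available, another common option is to read the generated key back from the insert itself, inside the same transaction. A sketch with Python's sqlite3 module, where the DB-API cursor exposes it as lastrowid. Other databases have their own equivalents (LAST_INSERT_ID() in MySQL, INSERT ... RETURNING in PostgreSQL, RETURNING ... INTO in Oracle). The function name and the integer key types are assumptions for the auto-increment scenario.

```python
import sqlite3

def insert_contact_with_addresses(conn, name, addresses):
    """Insert a contact, read back the generated key, then insert its addresses."""
    with conn:  # one transaction for parent and children
        cur = conn.execute("INSERT INTO Contact (NAME) VALUES (?)", (name,))
        contact_id = cur.lastrowid  # key the database generated for this insert
        conn.executemany(
            "INSERT INTO Address (CONTACT_ID, NAME) VALUES (?, ?)",
            [(contact_id, a) for a in addresses])
    return contact_id

def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE Contact (ID INTEGER PRIMARY KEY AUTOINCREMENT, NAME TEXT);
        CREATE TABLE Address (
            ID         INTEGER PRIMARY KEY AUTOINCREMENT,
            CONTACT_ID INTEGER NOT NULL REFERENCES Contact(ID),
            NAME       TEXT
        );
    """)
    cid = insert_contact_with_addresses(conn, "Joe Bloggs", ["123 Apple Lane"])
    rows = conn.execute("SELECT CONTACT_ID, NAME FROM Address").fetchall()
    return rows, cid
```

In a Unit of Work this means flushing parents before children at commit time and patching the children's foreign keys in memory as each parent key comes back; it costs no extra round trip beyond the inserts themselves.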

How to enforce DB integrity with non-unique foreign keys?

I want to have a database table that keeps data with revision history (like pages on Wikipedia). I thought that a good idea would be to have two columns that identify the row: (name, version). So a sample table would look like this:
TABLE PERSONS:
id: int,
name: varchar(30),
version: int,
... // some data assigned to that person.
So if users want to update person's data, they don't make an UPDATE -- instead, they create a new PERSONS row with the same name but different version value. Data shown to the user (for given name) is the one with highest version.
I have a second table, say, DOGS, that references persons in PERSONS table:
TABLE DOGS:
id: int,
name: varchar(30),
owner_name: varchar(30),
...
Obviously, owner_name is a reference to PERSONS.name, but I cannot declare it as a Foreign Key (in MS SQL Server), because PERSONS.name is not unique!
Question: How, then, in MS SQL Server 2008, should I ensure database integrity (i.e., that for each DOG, there exists at least one row in PERSONS such that its PERSON.name == DOG.owner_name)?
I'm looking for the most elegant solution -- I know I could use triggers on PERSONS table, but this is not as declarative and elegant as I want it to be. Any ideas?
Additional Information
The design above has the following advantage that if I need to, I can "remember" a person's current id (or (name, version) pair) and I'm sure that data in that row will never be changed. This is important e.g. if I put this person's data as part of a document that is then printed and in 5 years someone might want to print a copy of it exactly unchanged (e.g. with the same data as today), then this will be very easy for them to do.
Maybe you can think of a completely different design that achieves the same purpose and its integrity can be enforced easier (preferably with foreign keys or other constraints)?
Edit: Thanks to Michael Gattuso's answer, I discovered another way this relationship can be described. There are two solutions, which I posted as answers. Please vote which one you like better.
In your parent table, create a unique constraint on (id, version). Add version column to your child table, and use a check constraint to make sure that it is always 0. Use a FK constraint to map (parentid, version) to your parent table.
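A sketch of this constraint-only approach with Python's sqlite3 module. I am assuming the shared identifier is the name column from the question, so the unique constraint goes on (name, version) and the child column is called owner_version (a made-up name). Every dog is pinned to version 0 of its owner, which guarantees that at least one PERSONS row exists for it.

```python
import sqlite3

def demo():
    """Return 'rejected' if a dog with an unknown owner cannot be inserted."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript("""
        CREATE TABLE PERSONS (
            id      INTEGER PRIMARY KEY,
            name    TEXT    NOT NULL,
            version INTEGER NOT NULL,
            UNIQUE (name, version)           -- target of the composite FK
        );
        CREATE TABLE DOGS (
            id            INTEGER PRIMARY KEY,
            name          TEXT    NOT NULL,
            owner_name    TEXT    NOT NULL,
            owner_version INTEGER NOT NULL DEFAULT 0 CHECK (owner_version = 0),
            FOREIGN KEY (owner_name, owner_version)
                REFERENCES PERSONS (name, version)
        );
        INSERT INTO PERSONS (name, version) VALUES ('alice', 0), ('alice', 1);
        INSERT INTO DOGS (name, owner_name) VALUES ('Rex', 'alice');
    """)
    try:
        # 'bob' has no version-0 row in PERSONS, so the FK must fail
        conn.execute("INSERT INTO DOGS (name, owner_name) VALUES ('Fido', 'bob')")
    except sqlite3.IntegrityError:
        return "rejected"
    return "accepted"
```

The same DDL shape should carry over to SQL Server, which also allows a foreign key to reference a unique constraint rather than the primary key.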
Alternatively you could maintain a person history table for the data that has historic value. This way you keep your Persons and Dogs table tidy and the references simple but also have access to the historically interesting information.
Okay, first thing is that you need to normalize your tables. Google "database normalization" and you'll come up with plenty of reading. The PERSONS table, in particular, needs attention.
Second thing is that when you're creating foreign key references, 99.999% of the time you want to reference an ID (numeric) value. I.e., [DOGS].[owner] should be a reference to [PERSONS].[id].
Edit: Adding an example schema (forgive the loose syntax). I'm assuming each dog has only a single owner. This is one way to implement Person history. All columns are not-null.
Persons Table:
int Id
varchar(30) name
...
PersonHistory Table:
int Id
int PersonId (foreign key to Persons.Id)
int Version (auto-increment)
varchar(30) name
...
Dogs Table:
int Id
int OwnerId (foreign key to Persons.Id)
varchar(30) name
...
The latest version of the data would be stored in the Persons table directly, with older data stored in the PersonHistory table.
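A sketch of how an update could work with this layout, using Python's sqlite3 module. The helper name update_person_name is hypothetical, and I number versions per person with a counter instead of a table-wide auto-increment, which is an assumption on my part.

```python
import sqlite3

def update_person_name(conn, person_id, new_name):
    """Archive the current row in PersonHistory, then overwrite it in Persons."""
    with conn:  # both statements commit together or not at all
        conn.execute(
            "INSERT INTO PersonHistory (PersonId, Version, name) "
            "SELECT p.Id, "
            "       (SELECT COUNT(*) + 1 FROM PersonHistory h "
            "        WHERE h.PersonId = p.Id), "
            "       p.name "
            "FROM Persons p WHERE p.Id = ?", (person_id,))
        conn.execute("UPDATE Persons SET name = ? WHERE Id = ?",
                     (new_name, person_id))

def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Persons (Id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE PersonHistory (
            Id       INTEGER PRIMARY KEY,
            PersonId INTEGER NOT NULL REFERENCES Persons(Id),
            Version  INTEGER NOT NULL,
            name     TEXT    NOT NULL,
            UNIQUE (PersonId, Version)
        );
        INSERT INTO Persons VALUES (1, 'Alice');
    """)
    update_person_name(conn, 1, "Alicia")
    update_person_name(conn, 1, "Ali")
    history = conn.execute(
        "SELECT Version, name FROM PersonHistory ORDER BY Version").fetchall()
    current = conn.execute("SELECT name FROM Persons WHERE Id = 1").fetchone()[0]
    return history, current
```

Dogs keep pointing at Persons.Id throughout, so the foreign key never has to change when a person's data does.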
I would use an association table to link the many versions to the one pk.
A project I have worked on addressed a similar problem. It was a biological records database where species names can change over time as new research improved understanding of taxonomy.
However old records needed to remain related to the original species names. It got complicated but the basic solution was to have a NAME table that just contained all unique species names, a species table that represented actual species and a NAME_VERSION table that linked the two together. At any one time there would be a preferred name (ie the currently accepted scientific name for the species) which was a boolean field held in name_version.
In your example this would translate to a Details table (detailsid, otherdetails columns) a link table called DetailsVersion (detailsid, personid) and a Person Table (personid, non-changing data). Relate dogs to Person.
Persons
id (int),
name,
.....
activeVersion (this will be UID from personVersionInfo)
note: Above table will have 1 row for each person. will have original info with which person was created.
PersonVersionInfo
UID (unique identifier to identify person + version),
id (int),
name,
.....
versionId (this will be generated for each person)
Dogs
DogID,
DogName
......
PersonsWithDogs
UID,
DogID
EDIT: You will have to join PersonsWithDogs, PersonVersionInfo and Dogs to get the full picture (as of today). This kind of structure will help you link a Dog to the Owner (with a specific version).
In case the Person's info changes and you wish to have the latest info associated with the Dog, you will have to update the PersonsWithDogs table to hold the required UID (of the person) for the given Dog.
You can add restrictions, such as DogID being unique in PersonsWithDogs.
And in this structure, a UID (person) can have many Dogs.
Your scenarios (what can change/restrictions etc) will help in designing the schema better.
Thanks to Michael Gattuso's answer, I discovered another way this relationship can be described. There are two solutions, this is the first of them. Please vote which one you like better.
Solution 1
In PERSONS table, we leave only the name (unique identifier) and a link to current person's data:
TABLE PERSONS:
name: varchar(30),
current_data_id: int
We create a new table, PERSONS_DATA, that contains all data history for that person:
TABLE PERSONS_DATA:
id: int
version: int (auto-generated)
... // some data, like address, etc.
DOGS table stays the same, it still points to a person's name (FK to PERSONS table).
ADVANTAGE: for each dog, there exists at least one PERSONS_DATA row that contains data of its owner (that's what I wanted)
DISADVANTAGE: if you want to change a person's data, you have to:
add a new PERSONS_DATA row
update PERSONS entry for this person to point to the new PERSONS_DATA row.
Thanks to Michael Gattuso's answer, I discovered another way this relationship can be described. There are two solutions, this is the second of them. Please vote which one you like better.
Solution 2
In PERSONS table, we leave only the name (unique identifier) and a link to the first (not current!) person's data:
TABLE PERSONS:
name: varchar(30),
first_data_id: int
We create a new table, PERSONS_DATA, that contains all data history for that person:
TABLE PERSONS_DATA:
id: int
name: varchar(30)
version: int (auto-generated)
... // some data, like address, etc.
DOGS table stays the same, it still points to a person's name (FK to PERSONS table).
ADVANTAGES:
for each dog, there exists at least one PERSONS_DATA row that contains data of its owner (that's what I wanted)
if I want to change a person's data, I don't have to update the PERSONS row, only add a new PERSONS_DATA row
DISADVANTAGE: to retrieve current person's data, I have to either:
choose PERSONS_DATA with given name and highest version (may be expensive)
choose PERSONS_DATA with special version, e.g. "-1", but then I would have to update two PERSONS_DATA rows each time I add new PERSONS_DATA, and in this solution I wanted to avoid having to update 2 rows...
What do you think?
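For what it's worth, the "highest version" lookup in Solution 2 can be written as below, sketched with Python's sqlite3 module. The address column and the UNIQUE (name, version) constraint are assumptions; with an index on those two columns the lookup is usually a short index seek rather than a scan, though you should check the plan on your own database.

```python
import sqlite3

def current_data(conn, name):
    """Return the PERSONS_DATA row with the highest version for a name."""
    return conn.execute(
        "SELECT id, version, address FROM PERSONS_DATA "
        "WHERE name = ? ORDER BY version DESC LIMIT 1", (name,)).fetchone()

def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE PERSONS_DATA (
            id      INTEGER PRIMARY KEY,
            name    TEXT    NOT NULL,
            version INTEGER NOT NULL,
            address TEXT,                 -- hypothetical data column
            UNIQUE (name, version)        -- also serves as the lookup index
        );
        INSERT INTO PERSONS_DATA (name, version, address) VALUES
            ('alice', 1, 'Old Road 1'),
            ('alice', 2, 'New Road 2'),
            ('bob',   1, 'Hill St 3');
    """)
    return current_data(conn, "alice")
```

On SQL Server the equivalent would use SELECT TOP 1 ... ORDER BY version DESC instead of LIMIT.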