Designing contact/company/address tables for a database - sql

I'm trying to design the part of a database that holds addresses, companies and contacts. My first design was poor, and I now have the job of cleaning it up.
I bought a copy of Joe Celko's SQL Programming Style for reference, as I'm coming from a programming angle, and ended up with...
Addresses
street_1_adr varchar(80) primary key
street_2_adr varchar(80)
street_3_adr varchar(80)
zip_code varchar(10) foreign key/primary key > Regions.zip_code
With a check to ensure all addresses are unique to prevent duplicates.
Regions
city varchar(80)
region varchar(80)
zip_code varchar(10) primary key
country_nbr integer foreign key/primary key > Countries.country_nbr
With a check to ensure all regions are unique to prevent duplicates.
Countries
country_nbr integer primary key
country_nm varchar(80)
country_code char(3)
With a check to ensure that only one record exists for all the information.
Companies
company_nm varchar(80) primary key
street_1_adr varchar(80) foreign key > Addresses.street_1_adr
zip_code varchar(10) foreign key > Addresses.zip_code
Extra information
With a check to ensure that only one company with that name can exist at the address specified
Contacts
company_nm varchar(80) primary key/foreign key > Companies.company_nm
first_nm varchar(80) primary key
last_nm varchar(80) primary key
Extra information
But this means that if I want to hook, as an example, an order onto a contact, I need to do it with three fields.
Does this look right, or have I completely missed the point?

Firstly, I recommend using integer values for your primary keys
(if you are using MySQL, auto_increment is a handy feature, too).
When using your PK (primary key) as an FK (foreign key) in a different table, use the same datatype and don't store names.
You seem to store the company_nm in "Contacts" even though you could simply store the ID of the company and get the name via a join.
In your case it works, since the name is the primary key (varchar), but what happens when you get the same company name twice (e.g. McDonald's has more than one location)?
ERP systems mostly use a structure like this, or something close to it:
company (id and name)
site (id, name, FK company, additional information like address)
address (mostly referenced directly in site and sometime part of site)
region + country (all of them are "basic" data and referenced by ID in address table)
The company table mostly stores only the ID and name of a company.
The site table (with a foreign key to company) gives the "company" its addresses, legal information, etc.
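As a rough illustration of the company/site split described above, here is a minimal sketch using Python's built-in sqlite3 module. The column names (street, zip_code), the sample rows and the choice of SQLite are assumptions made only so the example runs; adapt the types and the auto-increment syntax to your own database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE company (
    id   INTEGER PRIMARY KEY,   -- surrogate key, assigned by SQLite
    name TEXT NOT NULL
);
CREATE TABLE site (
    id         INTEGER PRIMARY KEY,
    company_id INTEGER NOT NULL REFERENCES company(id),
    name       TEXT NOT NULL,
    street     TEXT NOT NULL,   -- hypothetical address columns
    zip_code   TEXT NOT NULL
);
""")

# One company row, stored once ...
company_id = conn.execute(
    "INSERT INTO company (name) VALUES (?)", ("McDonald's",)
).lastrowid

# ... and two sites that reference it by integer id.
conn.executemany(
    "INSERT INTO site (company_id, name, street, zip_code) VALUES (?, ?, ?, ?)",
    [
        (company_id, "Downtown", "1 Main St", "10001"),
        (company_id, "Airport", "9 Terminal Rd", "11430"),
    ],
)

# The name is recovered with a join instead of being copied into each row.
rows = conn.execute(
    "SELECT c.name, s.name FROM site s "
    "JOIN company c ON c.id = s.company_id ORDER BY s.id"
).fetchall()
```

Both sites share one company row, so anything that needs to point at a company or a site (a contact, an order) can do so with a single integer column.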

A couple of thoughts:
First of all, a zip code can represent multiple cities/towns in the same state. Also, one city can have multiple zip codes.
Usually you do not find an address table separate from the entity. In other words, your company table should carry the full address.
The primary keys for the tables are usually unique identifiers or auto-increment numbers separate from the actual names. That way, if a company or contact changes its name, or a typo was entered and corrected, you do not need to cascade the change to other tables.
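To make the point about name changes concrete, here is a small sketch (Python with sqlite3; the table layout and sample data are hypothetical, loosely following the question). The contact row stores only the integer company_id, so correcting the company name is a single-row update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE company (
    company_id INTEGER PRIMARY KEY,
    company_nm TEXT NOT NULL
);
CREATE TABLE contact (
    contact_id INTEGER PRIMARY KEY,
    company_id INTEGER NOT NULL REFERENCES company(company_id),
    first_nm   TEXT NOT NULL,
    last_nm    TEXT NOT NULL
);
INSERT INTO company (company_id, company_nm) VALUES (1, 'Acme Lmited');  -- typo on purpose
INSERT INTO contact (contact_id, company_id, first_nm, last_nm)
VALUES (1, 1, 'Jane', 'Doe');
""")

before = conn.execute("SELECT * FROM contact").fetchall()

# Fix the typo: only the company row changes, nothing cascades.
conn.execute("UPDATE company SET company_nm = 'Acme Limited' WHERE company_id = 1")

after = conn.execute("SELECT * FROM contact").fetchall()
company_name_for_contact = conn.execute(
    "SELECT c.company_nm FROM contact p "
    "JOIN company c ON c.company_id = p.company_id"
).fetchone()[0]
```

The same idea answers the "three fields" worry in the question: an order would reference contact_id alone.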

You may want to future proof your design by allowing for many addresses and contacts to be added to a company. What you would do is create a many to many relationship by using a junction table (http://en.wikipedia.org/wiki/Junction_table)
Company
--------------
CompanyID (PK)
...
Address
--------------
AddressID (PK)
...
CompanyAddress
--------------
CompanyID (PK)
AddressID (PK)
The CompanyAddress table will allow you to have multiple addresses for each company. You can also do the same for contacts, depending if the contact is associated with the company or the address. Below is another link that talks about how to create the many to many relationship.
http://www.tomjewett.com/dbdesign/dbdesign.php?page=manymany.php
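A minimal runnable sketch of the junction table above, using Python's sqlite3 module. The Name, Street and ZipCode columns and the sample rows are assumptions standing in for the "..." in the outline.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Company (
    CompanyID INTEGER PRIMARY KEY,
    Name      TEXT NOT NULL
);
CREATE TABLE Address (
    AddressID INTEGER PRIMARY KEY,
    Street    TEXT NOT NULL,
    ZipCode   TEXT NOT NULL
);
CREATE TABLE CompanyAddress (
    CompanyID INTEGER NOT NULL REFERENCES Company(CompanyID),
    AddressID INTEGER NOT NULL REFERENCES Address(AddressID),
    PRIMARY KEY (CompanyID, AddressID)   -- one link per company/address pair
);
INSERT INTO Company VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO Address VALUES (10, '1 Main St', '10001'), (11, '2 Side St', '10002');
-- Acme has two addresses; Globex shares one of them.
INSERT INTO CompanyAddress VALUES (1, 10), (1, 11), (2, 10);
""")

acme_addresses = [
    row[0]
    for row in conn.execute(
        "SELECT a.Street FROM CompanyAddress ca "
        "JOIN Address a ON a.AddressID = ca.AddressID "
        "WHERE ca.CompanyID = 1 ORDER BY a.AddressID"
    )
]

# The composite primary key rejects linking the same pair twice.
try:
    conn.execute("INSERT INTO CompanyAddress VALUES (1, 10)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```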

Related

Can you make a foreign key from two tables?

So, I have three tables (MAIL, COUNTRY and CLIENT). Table COUNTRY is the parent table of table MAIL, so the primary key of table MAIL is postcode from MAIL and countrycode from COUNTRY. I need to add a foreign key in table CLIENT that is made of postcode and countrycode, but I want the countrycode in that foreign key to come from COUNTRY, not MAIL. I only know how to make the foreign key in CLIENT from MAIL:
CREATE TABLE COUNTRY (countrycode char(3),
CONSTRAINT pk_country PRIMARY KEY(countrycode))
CREATE TABLE MAIL (postcode numeric(5), countrycode char(3),
FOREIGN KEY(countrycode) REFERENCES country(countrycode),
CONSTRAINT pk_mail PRIMARY KEY(postcode,countrycode))
CREATE TABLE CLIENT (OIB numeric(11), postcode numeric(5), countrycode char(3),
FOREIGN KEY(postcode,countrycode) REFERENCES mail(postcode,countrycode),
CONSTRAINT pk_client PRIMARY KEY(OIB))
BUT I don't want that. I want my countrycode in table CLIENT to come from COUNTRY not from MAIL.
I've tried:
FOREIGN KEY(postcode,countrycode) REFERENCES mail(postcode) AND country(countrycode)
but it doesn't work.
Is there a way to do that?
Don't use compound primary keys. Don't replicate data across multiple tables. So, look up the country based on the post code. I would suggest:
CREATE TABLE COUNTRIES (
countryid int identity(1, 1) primary key,
countrycode char(3) unique
);
CREATE TABLE PostCodes (
postcodeid int identity(1, 1) primary key,
countryid int references countries(countryid),
postcode nvarchar(25),
unique (countryid, postcode)
);
CREATE TABLE CLIENTS (
clientId int identity(1, 1) primary key,
OIB numeric(11) unique,
postcodeid int references postcodes(postcodeid)
);
Note some things:
You can look up the country via the postal code.
Postal codes are not always numeric.
Countries change over time (e.g. East Timor, South Sudan).
Postal codes can change (e.g. 10021 on the Upper East Side of Manhattan split into 10065 and 10075, once upon a time).
No, you can't have a single foreign key reference multiple tables, but you don't need to: You can have multiple foreign keys per table, so you can just reference both.
CREATE TABLE CLIENT (
OIB numeric(11),
postcode numeric(5),
countrycode char(3),
FOREIGN KEY(postcode,countrycode) REFERENCES mail(postcode,countrycode),
FOREIGN KEY(countrycode) REFERENCES country(countrycode),
CONSTRAINT pk_client PRIMARY KEY(OIB)
)
Other posts have presented a number of arguments regarding your existing table structures, some (imho) more valid than others. My reply is based solely on your existing structures, and does not attempt to second guess whether they are “right” or “wrong”. (Yes, I have opinions on the topic, but that’s not what you were asking about.)
On the face of it, there’s no need for the FK on column CountryCode in CLIENT to go all the way to COUNTRY to “validate itself”. The FK in MAIL ensures that any CountryCode in MAIL will also be in COUNTRY; so, an FK in CLIENT to MAIL on CountryCode is just as good at validating CountryCode as an FK directly to Country.
The one exception I can see to this is if you need to have a CLIENT with a valid CountryCode, but without a valid MAIL entry. (One thing unclear from your model is whether the columns are NULLable or not.) This might be the case if a CLIENT must have a COUNTRY, but does not have to have a MAIL. To do this, I’d use two FKs: one to MAIL on both columns, and one to COUNTRY on CountryCode, with column PostalCode set to allow NULLs. This way, CountryCode will always be validated against COUNTRY, and CountryCode + PostalCode will always be validated against MAIL, but only when PostalCode is NOT NULL.
Again, whether this is “right” or “wrong” architecture ultimately depends on what problems you are trying to solve—business, performance, storage (volume), and so on.
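The two-foreign-key variant with a NULLable postcode can be tried out with a short sketch. It uses Python's sqlite3 module and the table names from the question; the sample country code and postcodes are made up, and the behaviour shown (a composite foreign key is not checked when one of its columns is NULL) is SQLite's, so verify it on your own DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE country (countrycode CHAR(3) PRIMARY KEY);
CREATE TABLE mail (
    postcode    NUMERIC(5) NOT NULL,
    countrycode CHAR(3)    NOT NULL REFERENCES country(countrycode),
    PRIMARY KEY (postcode, countrycode)
);
CREATE TABLE client (
    oib         NUMERIC(11) PRIMARY KEY,
    postcode    NUMERIC(5)  NULL,          -- optional
    countrycode CHAR(3)     NOT NULL,      -- mandatory
    FOREIGN KEY (countrycode) REFERENCES country(countrycode),
    FOREIGN KEY (postcode, countrycode) REFERENCES mail(postcode, countrycode)
);
INSERT INTO country VALUES ('HRV');
INSERT INTO mail VALUES (10000, 'HRV');
""")


def try_insert(oib, postcode, countrycode):
    """Return True if the client row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "INSERT INTO client (oib, postcode, countrycode) VALUES (?, ?, ?)",
                (oib, postcode, countrycode),
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = {
    "known postcode": try_insert(1, 10000, "HRV"),    # both FKs satisfied
    "no postcode": try_insert(2, None, "HRV"),        # only the country FK applies
    "unknown postcode": try_insert(3, 99999, "HRV"),  # rejected by the FK to mail
    "unknown country": try_insert(4, None, "XXX"),    # rejected by the FK to country
}
```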

Is this a good database design practice?

I have a Person table; each Person can visit several countries. The countries visited by each Person are stored in the table CountryVisit.
Person:
PersonId,
Name
CountryVisit:
CountryVisitId (primary key)
PersonId (foreign key to 'Person.PersonId')
CountryName
VisitDate
For the CountryVisit table, my primary key is CountryVisitId, which is an identity column. With this design, a Person who has only one CountryVisit can still end up with a CountryVisitId of, for example, 40. Would it be better practice to add another surrogate key column to act as the identity column, and make CountryVisitId a natural key that is unique for each PersonId?
It is pretty good. I would suggest that you have a separate table for countries, with one row per country. Then the CountryVisits table would have:
CountryVisitId PrimaryKey,
PersonId ForeignKey,
CountryId ForeignKey,
VisitDate
This will ensure that the country name is always spelled correctly and consistently. If you want a list of countries to get started, check out this Wikipedia page. Also note that your definition of country may be different from the standard list of countries (there are actually several out there), so you should use your own auto-incremented primary key, rather than using the country code.
And, you should relax the requirement and remove the unique or primary key on PersonId, CountryId, unless you want to enforce only one visit per country.
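Here is a minimal sketch of that layout, written with Python's sqlite3 module. The Country table's column names and all sample rows are assumptions. There is no unique constraint on (PersonId, CountryId), so repeat visits are allowed, while the foreign key keeps unknown countries out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Person (
    PersonId INTEGER PRIMARY KEY,
    Name     TEXT NOT NULL
);
CREATE TABLE Country (
    CountryId   INTEGER PRIMARY KEY,     -- your own auto-incremented key
    CountryName TEXT NOT NULL UNIQUE     -- each country spelled exactly once
);
CREATE TABLE CountryVisit (
    CountryVisitId INTEGER PRIMARY KEY,
    PersonId       INTEGER NOT NULL REFERENCES Person(PersonId),
    CountryId      INTEGER NOT NULL REFERENCES Country(CountryId),
    VisitDate      TEXT NOT NULL
);
INSERT INTO Person VALUES (1, 'Ana');
INSERT INTO Country VALUES (1, 'France'), (2, 'Japan');
INSERT INTO CountryVisit (PersonId, CountryId, VisitDate) VALUES
    (1, 1, '2019-05-01'),
    (1, 1, '2021-08-15'),   -- second visit to the same country
    (1, 2, '2020-02-10');
""")

visits_per_country = conn.execute(
    "SELECT c.CountryName, COUNT(*) FROM CountryVisit v "
    "JOIN Country c ON c.CountryId = v.CountryId "
    "WHERE v.PersonId = 1 GROUP BY c.CountryName ORDER BY c.CountryName"
).fetchall()

# A visit to a country that is not in the Country table is rejected.
try:
    conn.execute(
        "INSERT INTO CountryVisit (PersonId, CountryId, VisitDate) "
        "VALUES (1, 99, '2022-01-01')"
    )
    unknown_country_rejected = False
except sqlite3.IntegrityError:
    unknown_country_rejected = True
```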

One Address Table for Many entities?

Conceptual stage question:
I have several Tables (Person, Institution, Factory) each has many kinds of Addresses (Mailing, Physical)
Is there a way to create a single Address table that contains all the addresses of all the Entities?
I'd rather not have a PersonAddress and FactoryAddress etc set of tables.
Is there another option?
The amount of data will only be several thousand addresses at most, so light in impact.
My proposal relies on the principle that one entity (Person, Institution, Factory, etc.) can have multiple addresses, which is usually the case (home, business, etc.), and that one address can be shared by entities of different kinds:
CREATE TABLE ADDRESS
(
ID INT IDENTITY PRIMARY KEY NOT NULL,
.... (your address fields here)
id_Person ... NULL,
id_Institution ... NULL,
id_Factory ... NULL
)
The main limitation is that two different persons cannot share the same address. In such a situation, you'll have to go with an additional "EntityAddress" table, like this:
CREATE TABLE ADDRESS
(
ID INT IDENTITY PRIMARY KEY NOT NULL,
.... (your address fields here)
)
CREATE TABLE ENTITY_ADDRESS
(
ID INT IDENTITY PRIMARY KEY NOT NULL,
id_Address .... NOT NULL,
id_Person .... NULL,
id_Institution ... NULL,
id_Factory .... NULL
)
The last model allows you to share, for example, one address among multiple persons working in the same institution.
BUT: in my opinion, the 'better' solution would be to merge your different entities into one table. You will then need:
An Entity table, holding all entities
An Entity Type table, containing the different entity types. In your case it has at least 3 rows: persons, factories, institutions
If one address per entity is enough, you could store the address details as columns of the Entity table.
If you need multiple addresses per entity, you'll have to go with the Addresses table with an Id_Entity as a foreign key.
If you want to share one address among multiple entities, each entity potentially having multiple addresses (a many-to-many relation between entities and addresses), then you will need the EntityAddress table in addition to the Entity and Address tables.
Your choice between these models will depend on your needs and your business rules.
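The merged-entity model with a many-to-many link can be sketched as follows (Python with sqlite3). The column names, the composite key on EntityAddress and the sample rows are assumptions for illustration, not part of the proposal above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE EntityType (
    ID   INTEGER PRIMARY KEY,
    Name TEXT NOT NULL UNIQUE
);
CREATE TABLE Entity (
    ID            INTEGER PRIMARY KEY,
    id_EntityType INTEGER NOT NULL REFERENCES EntityType(ID),
    Name          TEXT NOT NULL
);
CREATE TABLE Address (
    ID     INTEGER PRIMARY KEY,
    Street TEXT NOT NULL,
    City   TEXT NOT NULL
);
CREATE TABLE EntityAddress (
    id_Entity  INTEGER NOT NULL REFERENCES Entity(ID),
    id_Address INTEGER NOT NULL REFERENCES Address(ID),
    PRIMARY KEY (id_Entity, id_Address)
);
INSERT INTO EntityType VALUES (1, 'Person'), (2, 'Institution'), (3, 'Factory');
INSERT INTO Entity VALUES (1, 1, 'Alice'), (2, 2, 'City Library'), (3, 3, 'Plant 7');
INSERT INTO Address VALUES (1, '5 Oak Ave', 'Springfield'), (2, '12 Mill Rd', 'Springfield');
-- Alice and the library share address 1; the factory has its own.
INSERT INTO EntityAddress VALUES (1, 1), (2, 1), (3, 2);
""")

# Everyone registered at address 1, whatever kind of entity they are.
sharing_address_1 = conn.execute(
    "SELECT e.Name, t.Name FROM EntityAddress ea "
    "JOIN Entity e ON e.ID = ea.id_Entity "
    "JOIN EntityType t ON t.ID = e.id_EntityType "
    "WHERE ea.id_Address = 1 ORDER BY e.ID"
).fetchall()
```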
You need to use abstraction and inheritance.
An individual and institution (I'd call it organization) are really just concrete representations of an abstract legal party.
A mailing or physical address is the concretion of an abstract address, which could also be an email address, telephone number, or web address.
A legal party can have zero or more addresses.
An address can belong to zero or more legal parties.
A party could use the same address for multiple roles, such as 'Home' address and 'Work' address.
If a factory is big enough, sub-facilities in the factory might have their own addresses, so you might want to consider a hierarchical relationship there. For example, each apartment in a condo has its own address. Each building in a large factory might have its own address.
create table party (
party_id identity primary key
);
create table individual (
individual_id int primary key references party(party_id),
...
);
create table organization (
organization_id int primary key references party(party_id),
...
);
create table address (
address_id identity primary key,
...
);
create table mailing_address (
address_id int primary key references address(address_id),
...
);
create table party_address (
party_id int references party(party_id),
address_id int references address(address_id),
role varchar(255), --this should really point to a role table
primary key (party_id, address_id, role)
);
create table facility (
facility_id identity primary key,
address_id int not null references address(address_id),
parent_id int null references facility(facility_id)
);
In my opinion, you should create a pivot table to link each entity with its addresses,
for example institution_addresses(id, id_institution, id_address), person_addresses(id, id_person, id_address), etc.
You could very definitely do this. You could have the Address table that has an ID, then Person, Institution and Factory could all have foreign keys to the Address table.
If you need to be able to distinguish what kind of Address it is at the Address level, you could consider adding an AddressType table and having a foreign key to that on the Address table
Example:
CREATE TABLE ADDRESS
(
ID INT IDENTITY PRIMARY KEY NOT NULL,
City VARCHAR(50) NOT NULL,
State VARCHAR(2) NOT NULL,
Zip VARCHAR(10) NOT NULL,
AddressLine1 VARCHAR(200) NOT NULL,
AddressLine2 VARCHAR(200) NOT NULL,
)
CREATE TABLE Person
(
ID INT IDENTITY PRIMARY KEY NOT NULL,
AddressID INT FOREIGN KEY REFERENCES Address(ID)
)
CREATE TABLE Institution
(
ID INT IDENTITY PRIMARY KEY NOT NULL,
AddressID INT FOREIGN KEY REFERENCES Address(ID)
)
...etc

How to design relation between tables employee,client and phone Number?

I have a relational database with a Client table, containing id, name, and address, with many phone numbers
and I have an Employee table, also containing id, name, address, etc., and also with many phone numbers.
Is it more logical to create one "Phone Number" table and link the Clients and Employees, or to create two separate "Phone Number" tables, one for Clients and one for Employees?
If I choose to create one table, can I use one foreign key for both the Client and Employee or do I have to make two foreign keys?
If I choose to make one foreign key, will I have to make the Client ids start at 1 and increment by 5, and Employee ids start at 2 and increment by 5 so the two ids will not be the same?
If I create two foreign keys will one have a value and the other allow nulls?
The solution which I would go with would be:
CREATE TABLE Employees (
employee_id INT NOT NULL,
first_name VARCHAR(30) NOT NULL,
...
CONSTRAINT PK_Employees PRIMARY KEY (employee_id)
)
CREATE TABLE Customers (
customer_id INT NOT NULL,
customer_name VARCHAR(50) NOT NULL,
...
CONSTRAINT PK_Customers PRIMARY KEY (customer_id)
)
-- This is basic, only supports U.S. numbers, and would need to be changed to
-- support international phone numbers
CREATE TABLE Phone_Numbers (
phone_number_id INT NOT NULL,
area_code CHAR(3) NOT NULL,
prefix CHAR(3) NOT NULL,
line_number CHAR(4) NOT NULL,
extension VARCHAR(10) NULL,
CONSTRAINT PK_Phone_Numbers PRIMARY KEY (phone_number_id),
CONSTRAINT UI_Phone_Numbers UNIQUE (area_code, prefix, line_number, extension)
)
CREATE TABLE Employee_Phone_Numbers (
employee_id INT NOT NULL,
phone_number_id INT NOT NULL,
CONSTRAINT PK_Employee_Phone_Numbers PRIMARY KEY (employee_id, phone_number_id)
)
CREATE TABLE Customer_Phone_Numbers (
customer_id INT NOT NULL,
phone_number_id INT NOT NULL,
CONSTRAINT PK_Customer_Phone_Numbers PRIMARY KEY (customer_id, phone_number_id)
)
Of course, the model might change based on a lot of different things. Can an employee also be a customer? If two employees share a phone number, how will you handle it on the front end when the phone number for one employee is changed? Will it change the number for the other employee as well? Warn the user and ask what they want to do?
Those last few questions don't necessarily affect how the data is ultimately modeled, but will certainly affect how the front-end is coded and what kind of stored procedures you might need to support it.
"The Right Way", allowing you to use foreign keys for everything, would be to have a fourth table phoneNumberOwner(id) and have fields client.phoneNumberOwnerId and employee.phoneNumberOwnerId; thus, each client and each employee has its own record in the phoneNumberOwner table. Then, your phoneNumbers table becomes (phoneNumberOwnerId, phoneNumber), allowing you to attach multiple phone numbers to each phoneNumberOwner record.
Maybe you can somehow justify it, but to my way of thinking it is not logical to have employees and clients in the same table. It seems you want to do this only so that your foreign keys (in the telephone-number table) all point to the same table. This is not a good reason for combining employees and clients.
Use three tables: employees, clients, and telephone-number. In the telephone table, you can have a field that indicates employee or client. As an aside, I don't see why telephone number needs to be a foreign key: that only adds complexity with very little benefit, imo.
Unless there are special business requirements I would expect a telephone number to be an attribute of an employee or client entity and not an entity in its own right.
If it were considered an entity in its own right it would be 'all key' i.e. its identifier is the compound of its attributes and has no attributes other than its identifier. If the sub-attributes aren't stored apart then it only has one attribute i.e. the telephone number itself! Therefore, it isn't usually 'interesting' enough to be an entity in its own right and a telephone numbers table, whether superclass or subclass, is usually overkill (as I say, barring special business requirements).
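For readers who do want a single foreign key, the phoneNumberOwner idea from the answer above can be sketched like this (Python with sqlite3). The name columns, the UNIQUE constraints, the new_owner helper and the sample numbers are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE phoneNumberOwner (id INTEGER PRIMARY KEY);
CREATE TABLE client (
    id                 INTEGER PRIMARY KEY,
    name               TEXT NOT NULL,
    phoneNumberOwnerId INTEGER NOT NULL UNIQUE REFERENCES phoneNumberOwner(id)
);
CREATE TABLE employee (
    id                 INTEGER PRIMARY KEY,
    name               TEXT NOT NULL,
    phoneNumberOwnerId INTEGER NOT NULL UNIQUE REFERENCES phoneNumberOwner(id)
);
CREATE TABLE phoneNumbers (
    phoneNumberOwnerId INTEGER NOT NULL REFERENCES phoneNumberOwner(id),
    phoneNumber        TEXT NOT NULL,
    PRIMARY KEY (phoneNumberOwnerId, phoneNumber)
);
""")


def new_owner():
    """Create an owner row and return its id (hypothetical helper)."""
    return conn.execute("INSERT INTO phoneNumberOwner DEFAULT VALUES").lastrowid


client_owner = new_owner()
employee_owner = new_owner()

# Client 1 and employee 1 can both use id 1 in their own tables;
# only the owner ids have to be distinct, and the database assigns those.
conn.execute("INSERT INTO client VALUES (1, 'Client A', ?)", (client_owner,))
conn.execute("INSERT INTO employee VALUES (1, 'Employee B', ?)", (employee_owner,))
conn.executemany(
    "INSERT INTO phoneNumbers VALUES (?, ?)",
    [
        (client_owner, "555-0100"),
        (employee_owner, "555-0101"),
        (employee_owner, "555-0102"),
    ],
)

employee_numbers = [
    row[0]
    for row in conn.execute(
        "SELECT p.phoneNumber FROM employee e "
        "JOIN phoneNumbers p ON p.phoneNumberOwnerId = e.phoneNumberOwnerId "
        "WHERE e.id = 1 ORDER BY p.phoneNumber"
    )
]
```

This removes the need for interleaved id ranges (start at 1 or 2, increment by 5) that the question asks about.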

How to enforce uniques across multiple tables

I have the following tables in MySQL server:
Companies:
- UID (unique)
- NAME
- other relevant data
Offices:
- UID (unique)
- CompanyID
- ExternalID
- other data
Employees:
- UID (unique)
- OfficeID
- ExternalID
- other data
In each one of them the UID is unique identifier, created by the database.
There are foreign keys to ensure the links between Employee -> Office -> Company on the UID.
The ExternalID fields in Offices and Employees are the IDs provided to my application by the Company (my clients, actually). The clients do not know (and do not care) about my own IDs, and all the data my application receives from them is identified solely by their IDs (i.e. ExternalID in my tables).
I.e. a request from the client in pseudo-language is like "I'm Company X, update the data for my employee Y".
I need to enforce uniqueness on the combination of CompanyID and Employees.ExternalID, so in my database there will be no duplicate ExternalID for the employees of the same company.
I was thinking about 3 possible solutions:
Change the schema for Employees to include CompanyID, and create a unique constraint on the two fields.
Enforce a trigger, which upon update/insert in Employees validates the uniqueness.
Enforce the check on application level (i.e. my receiving service).
The alternative-dbadmin in me says that (3) is the worst solution, as it does not protect the database from inconsistency in case of an application bug or something else, and it will most probably be the slowest one.
The trigger solution may be what I want, but it may become complicated, especially if multiple inserts/updates need to be performed in a single statement, and I'm not sure about the performance vs. (1).
And (1) looks like the fastest and easiest approach, but it kind of goes against my understanding of the relational model.
What is the opinion of the SO DB experts on the pros and cons of each approach, especially if there is a possibility of adding an additional level of indirection, i.e. Company -> Office -> Department -> Employee, where the same uniqueness needs to be preserved (Company/Employee)?
You're right - #1 is the best option.
Granted, I would question it at first glance (because of shortcutting) but knowing the business rule to ensure an employee is only related to one company - it makes sense.
Additionally, I'd have a foreign key relating the companyid in the employee table to the companyid in the office table. Otherwise, you allow an employee to be related to a company without an office. Unless that is acceptable...
Triggers are a last resort if the relationship can not be demonstrated in the data model, and servicing the logic from the application means the logic is centralized - there's no opportunity for bad data to occur, unless someone drops constraints (which means you have bigger problems).
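A minimal sketch of option #1 together with the extra foreign key to the office table, using Python's sqlite3 module (the question is about MySQL, so treat the syntax as an approximation). The extra UNIQUE (UID, CompanyID) on Offices is an assumption needed to give the composite foreign key a target; the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Companies (UID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Offices (
    UID        INTEGER PRIMARY KEY,
    CompanyID  INTEGER NOT NULL REFERENCES Companies(UID),
    ExternalID TEXT NOT NULL,
    UNIQUE (CompanyID, ExternalID),
    UNIQUE (UID, CompanyID)          -- target for the composite FK below
);
CREATE TABLE Employees (
    UID        INTEGER PRIMARY KEY,
    CompanyID  INTEGER NOT NULL,
    OfficeID   INTEGER NOT NULL,
    ExternalID TEXT NOT NULL,
    UNIQUE (CompanyID, ExternalID),  -- the rule the question asks for
    FOREIGN KEY (OfficeID, CompanyID) REFERENCES Offices(UID, CompanyID)
);
INSERT INTO Companies VALUES (1, 'X'), (2, 'Y');
INSERT INTO Offices VALUES (10, 1, 'HQ'), (20, 2, 'HQ');
""")


def try_insert(company_id, office_id, external_id):
    """Return True if the employee row is accepted, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO Employees (CompanyID, OfficeID, ExternalID) "
                "VALUES (?, ?, ?)",
                (company_id, office_id, external_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert(1, 10, "E-1"),  # first employee E-1 of company X
    try_insert(1, 10, "E-1"),  # duplicate ExternalID within company X
    try_insert(2, 20, "E-1"),  # same ExternalID, different company
    try_insert(2, 10, "E-2"),  # office 10 belongs to company X, not Y
]
```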
Each of your company-provided tables should include CompanyID in the `UNIQUE KEY` over the company-provided ids.
Company-provided referential integrity should use company-provided ids:
CREATE TABLE company (
uid INT NOT NULL PRIMARY KEY,
name TEXT
);
CREATE TABLE office (
uid INT NOT NULL PRIMARY KEY,
companyID INT NOT NULL,
externalID INT NOT NULL,
UNIQUE KEY (companyID, externalID),
FOREIGN KEY (companyID) REFERENCES company (uid)
);
CREATE TABLE employee (
uid INT NOT NULL PRIMARY KEY,
companyID INT NOT NULL,
officeID INT NOT NULL,
externalID INT NOT NULL,
UNIQUE KEY (companyID, externalID),
FOREIGN KEY (companyID) REFERENCES company(uid),
FOREIGN KEY (companyID, officeID) REFERENCES office (companyID, externalID)
);
etc.
Set auto_increment_increment to the number of tables you have.
SET auto_increment_increment = 3; (you might want to set this in your my.cnf)
Then manually set the starting auto_increment value of each table to different values
first table to 1, second table to 2, third table to 3
Table 1 will have values like 1,4,7,10,13,etc
Table 2 will have values like 2,5,8,11,14,etc
Table 3 will have values like 3,6,9,12,15,etc
Of course this is just ONE option, personally I'd just make it a combo value. Could be as simple as TableID, AutoincrementID, Where the TableID is constant in all rows.
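The interleaving above is just "start + k * increment". The following Python sketch only reproduces that arithmetic; it does not talk to MySQL, and the function name is made up.

```python
from itertools import count, islice


def auto_increment_ids(offset, increment, n):
    """First n ids for a table whose counter starts at offset and steps by increment."""
    return list(islice(count(offset, increment), n))


# Three tables, increment 3, starting values 1, 2 and 3.
table_ids = {table: auto_increment_ids(table, 3, 5) for table in (1, 2, 3)}

# No id appears in more than one table.
all_ids = [i for ids in table_ids.values() for i in ids]
no_overlap = len(all_ids) == len(set(all_ids))
```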