Constraint is not enough? - sql

I have a very simple Database consist of;
Person_id, City and is_manager columns. is_manager can have TRUE or FALSE bit (0,1).
There are some restrictions that I manage to complete, such as;
each city can have one or more than one person (unique
person_id+City)
each person can be either manager or not in a city. (unique person+city+is_manager)
each city can have only one manager
but can have more than one non-managers.
not every person has to have a city, some might not be assigned to a
city at all.
I managed to do first two constraints easily but I couldn't manage for the third condition. because each city can have one manager, but can have more than one non_manager positions.
I have some other methodology to solve it but it will be not 100% on server-side solution. I want to have 100% SQL Server side solution.
I think it can be done with trigger;
if so, how? (when is_manager TRUE comes, search whole database for that if city has any manager before, if not accept data, otherwise don't accept)
if not necessary, how with constraints?

A sample DB
create table Person(
id int primary key,
cityId int,
constraint UK1 unique(cityId, id)
);
create table City (
id int primary key,
managerId int,
constraint FK1 foreign key(id, managerId) references Person(cityId, id)
);
Business process is first assign a person to a City then make him/her a manager of the City.
db<>fiddle including test data

Related

Benefits of using an autogenerated primary key instead of a constant unique name

I've heard that having an autogenerated primary key is a convention. However, I'm trying to understand its benefits in the following scenario:
CREATE TABLE countries
(
countryID int(11) PRIMARY KEY AUTO_INCREMENT,
countryName varchar(128) NOT NULL
);
CREATE TABLE students
(
studentID int(11) PRIMARY KEY AUTO_INCREMENT,
studentName varchar(128) NOT NULL,
countryOfOrigin int(11) NOT NULL,
FOREIGN KEY (countryOfOrigin) REFERENCES countries (countryID)
);
INSERT INTO countries (countryName)
VALUES ('Germany'), ('Sweden'), ('Italy'), ('China');
If I want to insert something into the students table, I need to lookup the countryIDs in the countries table:
INSERT INTO students (studentName, countryOfOrigin)
VALUES ('Benjamin Schmidt', (SELECT countryID FROM countries WHERE countryName = 'Germany')),
('Erik Jakobsson', (SELECT countryID FROM countries WHERE countryName = 'Sweden')),
('Manuel Verdi', (SELECT countryID FROM countries WHERE countryName = 'Italy')),
('Min Lin', (SELECT countryID FROM countries WHERE countryName = 'China'));
In a different scenario, as I know that the countryNames in the countries table are unique and not null, I could to the following:
CREATE TABLE countries2
(
countryName varchar(128) PRIMARY KEY
);
CREATE TABLE students2
(
studentID int(11) PRIMARY KEY AUTO_INCREMENT,
studentName varchar(128) NOT NULL,
countryOfOrigin varchar(128) NOT NULL,
FOREIGN KEY (countryOfOrigin) REFERENCES countries2 (countryName)
);
INSERT INTO countries2 (countryName)
VALUES ('Germany'), ('Sweden'), ('Italy'), ('China');
Now, inserting data into the students2 table is simpler:
INSERT INTO students2 (studentName, countryOfOrigin)
VALUES ('Benjamin Schmidt', 'Germany'),
('Erik Jakobsson', 'Sweden'),
('Manuel Verdi', 'Italy'),
('Min Lin', 'China');
So why should the first option be the better one, given that countryNames are unique and are never going to change?
There are two apects involved here:
natural keys vs. surrogate keys
autoincremented values
You are wondering why to have to deal with some arbitrary number for a country, when a country can be uniquely identified by its name. Well, imagine you use the country names in several tables to relate rows to each other. Then at some point you are told that you misspelled a country. You want to correct this, but have to do this in every table the country occurs in. In big databases you usually don't have cascading updates in order to avoid updates that unexpectedly take hours instead of mere minutes or seconds. So you must do this manually, but the foreign key constraints get in your way. You cannot change the parent table's key, because there are child tables using this, and you cannot change the key in the child tables first, because that key has to exist in the parent table. You'll have to work with a new row in the parent table and start from there. Quite some task. And even if you have no spelling issue, at some point someone might say "we need the official country names; you have China, but it must be the People's Republic of China instead" and again you must look up and change that contry in all those tables. And what about partial backups? A table gets totally messed up due to some programming error and must be replaced by last week's backup, because this is the best you have. But suddenly some keys don't match any more. You never want a table's key to change.
You say "country names are unique and are never going to change". Think again :-)
It is easier to have your database use a technical arbitrary ID instead. Then the country name only exists in the country table. And if that name must get changed, you change it just in that one place, and all relations stay intact. This, however, doesn't mean that natural keys are worse than technical IDs. They are not. But it's more difficult with them to set up a database correctly. In case of countries a good natural key would be a country ISO code, because this uniquely identifies a country and doesn't change. This would be my choice here.
With students it's the same. Students usually have a student number or student ID in real world, so you can simply use this number to uniquely identifiy a student in the database. But then, how do we get these unique student IDs? At a large university, two secretaries may want to enter new students at the same time. They ask the system what the last student's ID was. It was #11223, so they both want to issue #11224, which causes a conflict of course, because only one student can be given that number. In order to avoid this, DBMS offer sequences of which numbers are taken. Thus one of the secretaries pulls #11224 and the other #11225. Auto-incremented IDs work this way. Both secretaries enter their new student, the rows get inserted into the student table and result in the two different IDs that get reported back to the secretaries. This makes sequences and auto incrementing IDs a great and safe tool to work with.
Convention can be a useful guide. It isn't necessarily the best option in all situations.
There are usually tradeoffs involved, like space, convenience, etc.
While you showed one method of resolving / inserting the proper country key value, there's a slightly less wordy option supported by standard SQL (and many databases).
INSERT INTO students (studentName, countryOfOrigin)
WITH list (name, country) AS (
SELECT *
FROM (
VALUES ('Benjamin Schmidt', 'Germany')
, ('Erik Jakobsson', 'Sweden')
, ('Manuel Verdi', 'Italy')
, ('Min Lin', 'China')
) AS x
)
SELECT name, countryID
FROM list AS l
JOIN countries AS c
ON c.countryName = l.country
;
and a little less wordy again:
INSERT INTO students (studentName, countryOfOrigin)
WITH list (name, country) AS (
VALUES ('Benjamin Schmidt', 'Germany')
, ('Erik Jakobsson', 'Sweden')
, ('Manuel Verdi', 'Italy')
, ('Min Lin', 'China')
)
SELECT name, countryID
FROM list AS l
JOIN countries AS c
ON c.countryName = l.country
;
Here's a test case with MariaDB 10.5:
Working test case (updated)

Is this a good database design practice?

I got a Person table, each Person can visit several countries. The countries visited by each Person is stored in table CountryVisit
Person:
PersonId,
Name
CountryVisit:
CountryVisitId (primary key)
PersonId (foreign key to 'Person.PersonId')
CountryName
VisitDate
For the CountryVisit Table, my primary key is CountryVisitId which is an identity column. This design will result in that a Person can have only 1 CountryVisit but the CountryVisitId can be 40 for example. Is it a better practice to create another surrogate key column to act as an identity column while the CountryVisitId be a natural key that is unique for each PersonId ?
It is pretty good. I would suggest that you have a separate table for countries, with one row per country. Then the CountryVisits table would have:
CountryVisitId PrimaryKey,
PersonId ForeignKey,
CountryId ForeignKey,
VisitDate
This will ensure that the country name is always spelled correctly and consistently. If you want a list of countries to get started, check out this Wikipedia page. Also note that your definition of country may be different from the standard list of countries (there are actually several out there), so you should use your own auto-incremented primary key, rather than using the country code.
And, you should relax the requirement and remove the unique or primary key on PersonId, CountryId, unless you want to enforce only one visit per country.

Implementing complex references without a trigger in PostgreSQL

I am creating a PostgreSQL database: Country - Province - City.
A city must belong to a country and can belong to a province.
A province must belong to a country.
A city can be capital of a country:
CREATE TABLE country (
id serial NOT NULL PRIMARY KEY,
name varchar(100) NOT NULL
);
CREATE TABLE province (
id serial NOT NULL PRIMARY KEY,
name varchar(100) NOT NULL,
country_id integer NOT NULL,
CONSTRAINT fk_province_country FOREIGN KEY (country_id) REFERENCES country(id)
);
CREATE TABLE city (
id serial NOT NULL PRIMARY KEY,
name varchar(100) NOT NULL,
province_id integer,
country_id integer,
CONSTRAINT ck_city_provinceid_xor_countryid
CHECK ((province_id is null and country_id is not null) or
(province_id is not null and country_id is null)),
CONSTRAINT fk_city_province FOREIGN KEY (province_id) REFERENCES province(id),
CONSTRAINT fk_city_country FOREIGN KEY (country_id) REFERENCES country(id)
);
CREATE TABLE public.capital (
country_id integer NOT NULL,
city_id integer NOT NULL,
CONSTRAINT pk_capital PRIMARY KEY (country_id, city_id),
CONSTRAINT fk_capital_country FOREIGN KEY (country_id) REFERENCES country(id),
CONSTRAINT fk_capital_city FOREIGN KEY (city_id) REFERENCES city(id)
);
For some (but not all) countries I will have province data, so a city will belong to a province, and the province to a country. For the rest, I shall just know that the city belongs to a country.
Issue #1: Concerning the countries that I do have province data, I was looking for a solution that will disallow a city to belong to a country and at the same time to a province of a different country.
I preferred to enforce through a check constraint that either province or country (but NOT both) are not null in city. Looks like a neat solution.
The alternative would be to keep both province and country info within the city and enforce consistency through a trigger.
Issue #2: I want to disallow that a city is a capital to a country to which it does not belong. That seems impossible without a trigger after my solution to issue #1 because there is no way to directly reference the country a city belongs to.
Maybe the alternative solution to issue #1 is better, it also simplifies future querying.
I would radically simplify your design:
CREATE TABLE country (
country_id serial PRIMARY KEY -- pk is not null automatically
,country text NOT NULL -- just use text
,capital int REFERENCES city -- simplified
);
CREATE TABLE province ( -- never use "id" as name
province_id serial PRIMARY KEY
,province text NOT NULL -- never use "name" as name
,country_id integer NOT NULL REFERENCES country -- references pk per default
);
CREATE TABLE city (
city_id serial PRIMARY KEY
,city text NOT NULL
,province_id integer NOT NULL REFERENCES province,
);
Since a country can only have one capitol, no n:m table is needed.
Never use "name" or "id" as column names. That's an anti-pattern of some ORMs. Once you join a couple of tables (which you do a lot in relational databases) you end up with multiple columns of the same non-descriptive name, causing all kinds of problems.
Just use text. No point in varchar(n). Avoid problem like this.
The PRIMARY KEY clause makes a column NOT NULL automatically. (NOT NULL sticks, even if you later remove the pk constraint.)
And most importantly:
A city only references one province in all cases. No direct reference to country. Therefore mismatches are impossible, on-disk storage is smaller and your whole design is much simpler. Queries are simpler.
For every country enter a single dummy-province with an empty string as name (''), representing the country "as a whole". (Possibly even with the same id, you could have provinces and countries draw from the same sequence ...). Do this automatically in a trigger. This trigger is optional, though.
I chose an empty string instead of NULL, so the column can still be NOT NULL and a unique index over (country_id, province) does its job. You can easily identify this province representing the whole country and deal with it as appropriate in your application.
I am using a similar design successfully in multiple instances.
I think you can actually implement all these constraints without using triggers. It does require a bit of restructuring of the data.
Start by enforcing the relationship (using foreign keys) of:
city --> province --> country
For countries with no province information, invent a province -- perhaps with the country name, perhaps some weird default name ("CountryProvince"). This allows you to have only one set of relationship between the three entities. It automatically ensures that cities and provinces are in the right country, because you would get the country through the province.
The final question is about capitals. There is a way that you can implement and enforce uniqueness with no triggers. Keep a flag in the cities table and use a unique filtered index to guarantee uniqueness:
create unique index on cities_capitalflag on cities(capitalflag) where capitalflag = 'Y';
EDIT:
You are right about the the filtered index needing the country. But that requires storing the country in that table, which, in turn, requires keeping the provinces and cities aligned with respect to country. So, this solution gets close to not needing triggers but it isn't there.

Designing contact/company/address tables for a database

Trying to design part of a database to hold addresses, companies and contacts. I had one design of it in which I've now got the job of 'cleaning' it due to poor design.
Bought a copy of Joe Celko's SQL Programmer Style for reference as I'm coming from a programming angle so I ended up with...
Addresses
street_1_adr varchar(80) primary key
street_2_adr varchar(80)
street_3_adr varchar(80)
zip_code varchar(10) foreign key/primary key > Regions.zip_code
With a check to ensure all addresses are unique to prevent duplicates.
Regions
city varchar(80)
region varchar(80)
zip_code varchar(10) primary key
country_nbr integer foreign key/primary key > Countries.country_nbr
With a check to ensure all regions are unique to prevent duplicates.
Countries
country_nbr integer primary key
country_nm varchar(80)
country_code char(3)
With a check to ensure that only one record exists for all the information.
Companies
company_nm varchar(80) primary key
street_1_adr varchar(80) foreign key > Addresses.street_1_adr
zip_code varchar(10) foreign key > Addresses.zip_code
Extra information
With a check to ensure that only one company with that name can exist at the address specified
Contacts
company_nm varchar(80) primary key/foreign key > Companies.company_nm
first_nm varchar(80) primary key
last_nm varchar(80) primary key
Extra information
But this means that if I want to hook, as an example, an order onto a contact I need to do it with three fields.
Does this look right or have I completetly missed the point?
Firstly, I recommend using integer values for your primary keys
(if using mysql auto_increment is a handy feature, too)
When using your PK (primary key) as a FK (foreign key) in an different table, use the same datatype and don't save names.
You seem to save the company_name in "Contacts" even though you could simply save the ID of the company and get the name via a join-select.
IN your case it is OK, since the name is the primary key (varchar), but what happens when you get the same company name twice (eg Mc Donalds has more than one location)
ERP systems deploy those kind of structures mostly as or near as:
company (id and name)
site (id, name, FK company, additional information like address)
address (mostly referenced directly in site and sometime part of site)
region + country (all of them are "basic" data and referenced by ID in address table)
company table mostly only saves the ID and Name of an company.
site table (with foreign key relation to company) gives the "company" its adresses, legal information, etc.
A couple of thoughts:
First of all, a zip code can represent multiple cities/towns in the same state. Also, one city can have multiple zip codes.
Usually you to not find an address table separate from the entity. In other words, your company table should carry the full address.
The primary keys for the tables are usually unique identifiers or auto-increment numbers separate from the actual names. That way, if a company or contact changes it's name, or a typo was entered and corrected, you do not need to cascade the change to other tables.
You may want to future proof your design by allowing for many addresses and contacts to be added to a company. What you would do is create a many to many relationship by using a junction table (http://en.wikipedia.org/wiki/Junction_table)
Company
--------------
CompanyID (PK)
...
Address
--------------
AddressID (PK)
...
CompanyAddress
--------------
CompanyID (PK)
AddressID (PK)
The CompanyAddress table will allow you to have multiple addresses for each company. You can also do the same for contacts, depending if the contact is associated with the company or the address. Below is another link that talks about how to create the many to many relationship.
http://www.tomjewett.com/dbdesign/dbdesign.php?page=manymany.php

How to design relation between tables employee,client and phone Number?

I have a relational database with a Client table, containing id, name, and address, with many phone numbers
and I have an Employee table, also containing id, name, address, etc., and also with many phone numbers.
Is it more logical to create one "Phone Number" table and link the Clients and Employees, or to create two separate "Phone Number" tables, one for Clients and one for Employees?
If I choose to create one table, can I use one foreign key for both the Client and Employee or do I have to make two foreign keys?
If I choose to make one foreign key, will I have to make the Client ids start at 1 and increment by 5, and Employee ids start at 2 and increment by 5 so the two ids will not be the same?
If I create two foreign keys will one have a value and the other allow nulls?
The solution which I would go with would be:
CREATE TABLE Employees (
employee_id INT NOT NULL,
first_name VARCHAR(30) NOT NULL,
...
CONSTRAINT PK_Employees PRIMARY KEY (employee_id)
)
CREATE TABLE Customers (
customer_id INT NOT NULL,
customer_name VARCHAR(50) NOT NULL,
...
CONSTRAINT PK_Customers PRIMARY KEY (customer_id)
)
-- This is basic, only supports U.S. numbers, and would need to be changed to
-- support international phone numbers
CREATE TABLE Phone_Numbers (
phone_number_id INT NOT NULL,
area_code CHAR(3) NOT NULL,
prefix CHAR(3) NOT NULL,
line_number CHAR(4) NOT NULL,
extension VARCHAR(10) NULL,
CONSTRAINT PK_Phone_Numbers PRIMARY KEY (phone_number_id),
CONSTRAINT UI_Phone_Numbers UNIQUE (area_code, prefix, line_number, extension)
)
CREATE TABLE Employee_Phone_Numbers (
employee_id INT NOT NULL,
phone_number_id INT NOT NULL,
CONSTRAINT PK_Employee_Phone_Numbers PRIMARY KEY (employee_id, phone_number_id)
)
CREATE TABLE Customer_Phone_Numbers (
customer_id INT NOT NULL,
phone_number_id INT NOT NULL,
CONSTRAINT PK_Customer_Phone_Numbers PRIMARY KEY (customer_id, phone_number_id)
)
Of course, the model might changed based on a lot of different things. Can an employee also be a customer? If two employees share a phone number how will you handle it on the front end when the phone number for one employee is changed? Will it change the number for the other employee as well? Warn the user and ask what they want to do?
Those last few questions don't necessarily affect how the data is ultimately modeled, but will certainly affect how the front-end is coded and what kind of stored procedures you might need to support it.
"The Right Way", allowing you to use foreign keys for everything, would be to have a fourth table phoneNumberOwner(id) and have fields client.phoneNumberOwnerId and employee.phoneNumberOwnerId; thus, each client and each employee has its own record in the phoneNumberOwner table. Then, your phoneNumbers table becomes (phoneNumberOwnerId, phoneNumber), allowing you to attach multiple phone numbers to each phoneNumberOwner record.
Maybe you can somehow justify it, but to my way of thinking it is not logical to have employees and clients in the same table. It seems you wan to do this only so that your foreign keys (in the telephone-number table) all point to the same table. This is not a good reason for combining employees and clients.
Use three tables: employees, clients, and telephone-number. In the telephone table, you can have a field that indicates employee or client. As an aside, I don't see why telephone number needs to be a foreign key: that only adds complexity with very little benefit, imo.
Unless there are special business requirements I would expect a telephone number to be an attribute of an employee or client entity and not an entity in its own right.
If it were considered an entity in its own right it would be 'all key' i.e. its identifier is the compound of its attributes and has no attributes other than its identifier. If the sub-attributes aren't stored apart then it only has one attribute i.e. the telephone number itself! Therefore, it isn't usually 'interesting' enough to be an entity in its own right and a telephone numbers table, whether superclass or subclass, is usually overkill (as I say, barring special business requirements).