Help me normalize my simple book catalog schema - sql

This is a simple database model for an online library catalog. I am trying to normalize it, if possible. What do you think I should change or do differently?
For example, I am not sure about the table authors. It has only one column "name" which is also a primary key and I use it also as a foreign key in another table. Is that a good practice? Also should I put two columns there ("first_name" and "last_name") instead of just one?
CREATE TABLE books (
isbn VARCHAR2(13) NOT NULL PRIMARY KEY,
title VARCHAR2(200),
summary VARCHAR2(2000),
date_published DATE,
page_count NUMBER
);
CREATE TABLE authors (
name VARCHAR2(200) NOT NULL PRIMARY KEY
);
CREATE TABLE books_authors_xref (
author_name VARCHAR2(200),
book_isbn VARCHAR2(13),
CONSTRAINT pk_books_authors_xref PRIMARY KEY (author_name, book_isbn),
CONSTRAINT fk_books_authors_xref1 FOREIGN KEY (author_name) REFERENCES authors (name),
CONSTRAINT fk_books_authors_xref2 FOREIGN KEY (book_isbn) REFERENCES books (isbn)
);
CREATE TABLE book_copies (
barcode_id VARCHAR2(100) NOT NULL PRIMARY KEY,
book_isbn VARCHAR2(13),
CONSTRAINT fk_book_copies FOREIGN KEY (book_isbn) REFERENCES books (isbn)
);

It's reasonably normalized. I'd add a numeric "author_id" to the authors table and use it instead of author_name in the books_authors_xref table for the relationships. That lets you deal with two authors who share a name, and change how you store the name later without making a mess. :-)
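A minimal sketch of that suggestion, run through Python's sqlite3 module so it can be executed anywhere. SQLite stands in for Oracle here, so the types (INTEGER/TEXT) are assumptions, and the sample ISBN and titles are made up for illustration.

```python
import sqlite3

# Sketch only: SQLite stands in for Oracle, so INTEGER/TEXT replace
# NUMBER/VARCHAR2. author_id is the surrogate key suggested above.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE books (
    isbn  TEXT NOT NULL PRIMARY KEY,
    title TEXT
);
CREATE TABLE authors (
    author_id INTEGER PRIMARY KEY,   -- surrogate key, assigned automatically
    name      TEXT NOT NULL          -- no longer has to be unique
);
CREATE TABLE books_authors_xref (
    author_id INTEGER NOT NULL REFERENCES authors (author_id),
    book_isbn TEXT    NOT NULL REFERENCES books (isbn),
    PRIMARY KEY (author_id, book_isbn)
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO books VALUES ('9780000000001', 'Example Title')")
first = conn.execute("INSERT INTO authors (name) VALUES ('Richard Knop')").lastrowid
second = conn.execute("INSERT INTO authors (name) VALUES ('Richard Knop')").lastrowid
conn.execute("INSERT INTO books_authors_xref VALUES (?, ?)", (first, "9780000000001"))

# Changing how the name is stored touches one row; the xref row is untouched.
conn.execute("UPDATE authors SET name = 'Knop, Richard' WHERE author_id = ?", (first,))

linked = conn.execute("""
    SELECT a.name
    FROM books_authors_xref x
    JOIN authors a ON a.author_id = x.author_id
    WHERE x.book_isbn = '9780000000001'
""").fetchall()
```

Both rows named "Richard Knop" are accepted, and the rename does not touch books_authors_xref because the relationship hangs off author_id rather than the name.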

I think all four tables are in 5NF. What do you think?
But . . .
Author names aren't unique. Adding an ID number to the authors table identifies the row, but it doesn't identify the author. For example, assume there are two authors with the name "Richard Knop". You can't enter both into your existing table, because there's a primary key constraint on author names. If you try to fix that by adding an ID number, you might end up with this.
author_id  author_name
---------  -------------
1          Knop, Richard
2          Knop, Richard
Which one of those is you? How do you know?

In addition to using an "author_id" as mentioned by Christo and Catcall, you might also consider using a "book_id" for the primary key on your books table. Not all things published or printed have an ISBN, either because the book predates ISBN or because it was printed by someone who didn't think it needed one (such as a lot of the training material that I've seen over the years).
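Here is a rough sketch of what that could look like, again using Python's sqlite3 as a stand-in for Oracle. The book_id column is the surrogate key suggested above; keeping isbn as an optional UNIQUE column is my assumption about how you would still prevent duplicate ISBNs. Note that the handling of NULLs in unique constraints differs between databases, so check yours.

```python
import sqlite3

# Sketch only: SQLite stands in for Oracle; titles and the ISBN are made up.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE books (
    book_id INTEGER PRIMARY KEY,     -- surrogate key, always present
    isbn    TEXT UNIQUE,             -- optional: NULL for items without an ISBN
    title   TEXT NOT NULL
);
CREATE TABLE book_copies (
    barcode_id TEXT NOT NULL PRIMARY KEY,
    book_id    INTEGER NOT NULL REFERENCES books (book_id)
);
""")

conn.execute("INSERT INTO books (isbn, title) VALUES ('9780000000001', 'Modern Book')")
conn.execute("INSERT INTO books (isbn, title) VALUES (NULL, 'Pre-ISBN Book')")
conn.execute("INSERT INTO books (isbn, title) VALUES (NULL, 'Training Handout')")

# A second book with the same ISBN is still rejected by the UNIQUE constraint.
try:
    conn.execute("INSERT INTO books (isbn, title) VALUES ('9780000000001', 'Duplicate')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

book_count = conn.execute("SELECT COUNT(*) FROM books").fetchone()[0]
```

Copies then point at book_id, so a handout without an ISBN can still have barcoded copies.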

I tried this and it simply works. I made a small change and the table was created. Moreover, ISBN is used as a foreign key.
CREATE TABLE book_copies (
barcode_id VARCHAR2(100) NOT NULL PRIMARY KEY,
ISBN VARCHAR2(13),
CONSTRAINT FK_BOOK_COPIES FOREIGN KEY (ISBN) REFERENCES books (isbn)
);

Related

How to check for not null value in relationship table in PostgreSQL?

I have these three tables in a PostgreSQL db and, as you can see, books has an author_id field (referencing users) and users has a company_id field referencing companies.
create table companies (
id serial primary key,
name varchar not null
);
create table users (
id serial primary key,
email varchar not null,
company_id int,
constraint fk_company
foreign key(company_id)
references companies(id)
);
create table books (
id serial primary key,
title varchar not null,
author_id integer not null,
constraint fk_author
foreign key(author_id)
references users(id)
);
A user may or may not belong to a company, but to create a book the user must have a company reference.
I am wondering if there is a way to implement a CHECK constraint on the author_id column which would ensure that the author has a company reference.
Maybe something like: author_id integer not null CHECK(users.company_id is not null), but of course this doesn't work.
Is there some sort of constraint I can use to check columns in relations?
Also am I even approaching this problem in the right way?
Thanks in advance. :)
Looking at the constraints:
A Publisher can have zero or more publications.
An Author/User can have zero or more books authored.
A Book will have one Publisher and one or more Authors.
Move the company_id constraint to the books table. If and when a user (author) publishes a book, they become a publishing company and get an entry in the companies table.
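One way to enforce "the author must have a company" without a cross-table CHECK is a composite foreign key. This is a common technique I am swapping in, not necessarily what the answer above had in mind: books carries its own NOT NULL company_id, and the pair (author_id, company_id) must match a row in users. The sketch runs on SQLite via Python so it is easy to try; the same shape (a UNIQUE constraint on users plus a two-column FOREIGN KEY) should carry over to PostgreSQL, but treat that as an assumption to verify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE companies (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE users (
    id         INTEGER PRIMARY KEY,
    email      TEXT NOT NULL,
    company_id INTEGER REFERENCES companies (id),
    UNIQUE (id, company_id)          -- target for the composite foreign key
);
CREATE TABLE books (
    id         INTEGER PRIMARY KEY,
    title      TEXT NOT NULL,
    author_id  INTEGER NOT NULL,
    company_id INTEGER NOT NULL,     -- a book always carries a company
    FOREIGN KEY (author_id, company_id) REFERENCES users (id, company_id)
);
INSERT INTO companies (id, name) VALUES (1, 'Example Press');
INSERT INTO users (id, email, company_id) VALUES (1, 'a@example.com', 1);
INSERT INTO users (id, email, company_id) VALUES (2, 'b@example.com', NULL);
""")

def try_insert_book(title, author_id, company_id):
    """Return True if the book row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO books (title, author_id, company_id) VALUES (?, ?, ?)",
            (title, author_id, company_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False

with_company = try_insert_book("Allowed", 1, 1)        # user 1 belongs to company 1
wrong_company = try_insert_book("Rejected", 2, 1)      # user 2 has no company: no (2, 1) row
missing_company = try_insert_book("Rejected", 2, None) # NOT NULL on books.company_id
```

The trade-off is that company_id is now stored in both users and books, so changing a user's company needs a decision about their existing books.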

SQL Server Different ways of creating table with foreign key

I need to create a table with a foreign key. So far I have been doing that like this:
CREATE TABLE books
(
book_id NVARCHAR(15) NOT NULL UNIQUE,
author_id INT REFERENCES authors(author_id)
...
);
But my university professor sent me example scripts showing another way of dealing with foreign keys:
CREATE TABLE books
(
book_id NVARCHAR(15) NOT NULL UNIQUE,
author_id INT,
CONSTRAINT author_FK
FOREIGN KEY(author_id) REFERENCES authors(author_id)
...
);
Trying to find the difference between those, I did some research. Unfortunately I haven't found the answer; what I found was another way of creating a table with a foreign key (very similar to the second one):
CREATE TABLE books
(
book_id NVARCHAR(15) NOT NULL UNIQUE,
author_id INT,
FOREIGN KEY(author_id) REFERENCES authors(author_id)
...
);
Could you point out the differences between all of them?
Functionally, there is no difference between them. The first is called an inline constraint (the same style can be used for check constraints as well).
There are two minor differences. The first is that the constraint keyword is not necessary for the inline reference, so inline references often do not name the constraint (constraint is allowed, and you can name the reference, but that is not the syntax you show). Your third form is the second with the constraint name left out, so the database generates a name for it.
The second is that the inline foreign key reference can only use one column. For me, this is almost never an issue, because I almost always have synthetic primary keys rather than composite primary keys. However, the inline syntax is less powerful than the full constraint definition.
By the way, there is a further method, which uses alter table. This is similar to your second method, but it allows constraints to be added after a table has already been created.
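To see that the three spellings end up as the same foreign key, here is a small sketch that creates one table per form and asks the database what it recorded. It uses SQLite through Python rather than SQL Server (an assumption made only so it runs anywhere), and the table names books_inline, books_named and books_unnamed are invented for the comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE authors (author_id INTEGER PRIMARY KEY);

-- Form 1: inline (column-level) reference
CREATE TABLE books_inline (
    book_id   TEXT NOT NULL UNIQUE,
    author_id INTEGER REFERENCES authors (author_id)
);
-- Form 2: named table-level constraint
CREATE TABLE books_named (
    book_id   TEXT NOT NULL UNIQUE,
    author_id INTEGER,
    CONSTRAINT author_FK FOREIGN KEY (author_id) REFERENCES authors (author_id)
);
-- Form 3: unnamed table-level constraint
CREATE TABLE books_unnamed (
    book_id   TEXT NOT NULL UNIQUE,
    author_id INTEGER,
    FOREIGN KEY (author_id) REFERENCES authors (author_id)
);
""")

def foreign_keys(table):
    """Return (parent_table, child_column, parent_column) for each FK on table."""
    rows = conn.execute(f"PRAGMA foreign_key_list({table})").fetchall()
    # Row layout: id, seq, table, from, to, on_update, on_delete, match
    return [(row[2], row[3], row[4]) for row in rows]

inline_fk = foreign_keys("books_inline")
named_fk = foreign_keys("books_named")
unnamed_fk = foreign_keys("books_unnamed")
```

All three tables report the same relationship; what differs is only whether you chose the constraint's name yourself.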

what is meaning of oracle database query: primary key references t_name(col_name)?

create table books
(
bid number(5) primary key,
name varchar2(30)
);
create table members
(
mid number(5) primary key,
name varchar2(30)
);
create table issues
(
bid number(5) primary key
references books(bid),
mid number(5)
references members (mid)
);
I have three tables. The first two are simple, but what is the meaning of the third? I know what foreign key references t_name(col_name) means, but what do primary key references t_name(col_name) and col_name references t_name(col_name) mean?
This is not a special case. Here the primary key bid of table issues references the column bid of table books. This simply means that bid in issues can hold only values that are present in bid of books. It also acts as the primary key of issues, so each value is unique and is limited to those contained in the books table.
In short, it is a primary key whose values must also exist in table books.
It is the same as any other references statement. This is saying that the primary key also references Books(bid).
I can think of two reasons why this type of construct would be used. First, the "issues" entity could be a subset of the "book" entity. This would allow additional issues-specific columns to be stored in issues, without cluttering up books. It also allows foreign keys to either issues or books.
The second reason is that this is one way of implementing vertical partitioning. This occurs when a table has lots of columns. For performance reasons, you want to separate them into different storage areas. This is sort of similar to what columnar databases do, but it has the overhead of the additional primary key.
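A quick way to convince yourself of what the combined primary key and foreign key does is to try a few inserts. The sketch below recreates the three tables in SQLite through Python (standing in for Oracle, with INTEGER/TEXT instead of number/varchar2); the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE books   (bid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE members (mid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE issues (
    bid INTEGER PRIMARY KEY REFERENCES books (bid),   -- unique AND must exist in books
    mid INTEGER REFERENCES members (mid)              -- must exist in members
);
INSERT INTO books   VALUES (1, 'Book One');
INSERT INTO members VALUES (10, 'Member A'), (11, 'Member B');
""")

def try_issue(bid, mid):
    """Return True if the issue row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO issues (bid, mid) VALUES (?, ?)", (bid, mid))
        return True
    except sqlite3.IntegrityError:
        return False

first_issue = try_issue(1, 10)    # book exists and is not yet issued
second_issue = try_issue(1, 11)   # same book again: the primary key rejects it
unknown_book = try_issue(2, 10)   # no such book: the foreign key rejects it
```

So each book can appear in issues at most once, and only if it exists in books, which is the one-to-zero-or-one relationship described above.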

SQL Foreign key abbreviation

Are these T-SQL declarations equivalent?
CREATE TABLE Person
(
ID INT PRIMARY KEY,
NAME VARCHAR(60)
)
CREATE TABLE Dog
(
CHIP_ID INT PRIMARY KEY,
OWNER_ID INT REFERENCES Person(ID)
)
and
CREATE TABLE Person
(
ID INT PRIMARY KEY,
NAME VARCHAR(60)
)
CREATE TABLE Dog
(
CHIP_ID INT PRIMARY KEY,
OWNER_ID INT,
FOREIGN KEY(OWNER_ID) REFERENCES Person(ID)
)
I'm talking of course about the foreign key, I'm not sure if I have to specify it is a foreign key or not.
Thank you.
Yes, the DBMS sees both as the same. But humans can easily miss important details when the code is cryptic. In fact, my preference is this:
CREATE TABLE Person(
ID INT not null,
Name VARCHAR(60) not null,
constraint PK_Person primary key( ID )
);
CREATE TABLE Dog(
ID INT not null,
OwnerID INT,
constraint PK_Dog primary key( ID ),
constraint FK_Dog_Owner foreign key( OwnerID ) REFERENCES Person( ID )
);
Using the constraint clause not only defines the primary and foreign keys, but also allows us to give them meaningful names. The surrogate key of each table should be named "ID". Any foreign key in another table expands that name according to its context (RoleID), as with OwnerID in the Dog table. Another table with a FK to the same Person table may name it GroomerID or whatever else shows the role that person plays in the context of the table.
Also, as you can see, I prefer CamelCase with SQL Server, leaving OWNER_ID for Oracle.
Some even go so far as to place either NULL or NOT NULL after each column definition. But I find that adds clutter and doesn't tell even a beginning SQL developer anything they don't already know. So I only supply NOT NULL when appropriate and let the default carry. Actually, in the later versions of Oracle and SQL Server, the NOT NULL for the primary key field is optional, as the primary key is going to be defined as NOT NULL no matter what.
Long ago there seemed to be an informal contest to see who could cram the most operations into the fewest words or even characters. Just stay far away from that kind of thinking. But do make everything you write meaningful.
In general, use every opportunity to add meaningful information to the code. The computer doesn't care. Write to the other developers who will follow you.
Both T-SQL versions will create the foreign key you need. However, I believe the second approach, where the code explicitly states "FOREIGN KEY...", makes the code cleaner and easier for future software engineers to understand and maintain.

Creating Tables

Suppose you have the following database:
Person(ssn NUMERIC(9), name VARCHAR(40), gender CHAR(1)), ssn is primary key
Organization(org_code CHAR(4), budget INTEGER, org_name VARCHAR(60)), org_code is primary key
Person_Organization(ssn, org_code), both columns are the primary key.
Are the keys in the person_organization table considered foreign keys or primary keys? I am stuck on how to create this table. Have tried looking in my textbooks but cannot find information about it. I don't know if they are supposed to be foreign keys that reference the primary keys or if I should just do this
CREATE TABLE person_organization(ssn NUMERIC(9), org_code VARCHAR(60));
Any suggestions would be greatly appreciated.
Thanks.
The simple answer is that they're both.
ssn, org_code should be the primary key of person_organization.
ssn should be a foreign key back into person and org_code should be a foreign key back into organization.
To separate myself from northpole's answer: I don't agree with the surrogate key argument in this case. It doesn't seem to be needed, as it won't be used anywhere else.
Unfortunately, the problem with this (good) solution to the many-to-many relationship is that you often need two unique keys on the table, (ssn, org_code) and (org_code, ssn), and must choose one as the primary key.
As you're using Oracle the create table syntax would be
create table person_organization
( ssn number(9)
, org_code varchar2(60)
, constraint person_organization_pk primary key (ssn, org_code)
, constraint person_organization_ssn_fk foreign key ( ssn )
references person ( ssn )
, constraint person_organization_oc_fk foreign key ( org_code )
references organization ( org_code )
);
In your original table creation script you had ssn as numeric(9), which should be number(9). You may want to consider not restricting the size of this data type. You also had org_code as a varchar; this should probably be a varchar2.
Tech on the Net is a really good resource for learning syntax.
I would suggest adding a unique, auto-incrementing primary key to PERSON_ORGANIZATION (called something like po_id) as well as the two FOREIGN keys of ssn and org_code. You can also make those two unique if you want. From my experience, I like to have almost every table have its own unique/auto key (unless it is a lookup table or audit table (and possibly others)).
They're both.
For the person_organization table you would have a compound primary key that consisted of the two columns. Each is separately a foreign key to another table.
For normal database design they should reference the primary keys in the other tables and these constraints enforce the validity of the data in the database.
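To make the "they're both" answer concrete, here is a small sketch of the junction table with a compound primary key and two foreign keys, followed by a few inserts that the constraints accept or reject. It runs on SQLite via Python instead of Oracle (an assumption for portability, so the types are INTEGER/TEXT), and the people and organizations are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE person (
    ssn    INTEGER PRIMARY KEY,
    name   TEXT,
    gender TEXT
);
CREATE TABLE organization (
    org_code TEXT PRIMARY KEY,
    budget   INTEGER,
    org_name TEXT
);
CREATE TABLE person_organization (
    ssn      INTEGER NOT NULL REFERENCES person (ssn),             -- foreign key
    org_code TEXT    NOT NULL REFERENCES organization (org_code),  -- foreign key
    PRIMARY KEY (ssn, org_code)                                    -- compound primary key
);
INSERT INTO person VALUES (123456789, 'Pat', 'F');
INSERT INTO organization VALUES ('ACME', 1000, 'Acme'), ('GLOB', 2000, 'Globex');
""")

def try_link(ssn, org_code):
    """Return True if the membership row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO person_organization (ssn, org_code) VALUES (?, ?)",
            (ssn, org_code),
        )
        return True
    except sqlite3.IntegrityError:
        return False

first_org = try_link(123456789, "ACME")       # valid pair
second_org = try_link(123456789, "GLOB")      # same person, another organization
duplicate = try_link(123456789, "ACME")       # repeated pair: primary key rejects it
unknown_org = try_link(123456789, "NONE")     # no such organization: foreign key rejects it
unknown_person = try_link(999999999, "ACME")  # no such person: foreign key rejects it
```

The compound primary key stops the same membership being recorded twice, while each foreign key stops memberships that point at nothing.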
They are foreign keys.
You've listed "both columns are the primary key" but I don't think they are.
The table does not have a primary key.
The combination of the two fields is certainly acting as a proxy for a primary key, doing things like making sure entries are uniquely identified and thus acting together as a unique identifier but that is a bit different.
I would also recommend adding a separate primary key field for consistency with the structure of other tables. As with other tables, I recommend always using either id [my favorite] or tablename_id.
This is the basic idea; you need to provide the proper datatype for each field.
CREATE TABLE Persons (
ssn NUMERIC(9) NOT NULL PRIMARY KEY,
name VARCHAR(40),
gender CHAR(1)
);
CREATE TABLE Organization (
org_code CHAR(4) NOT NULL PRIMARY KEY,
budget INTEGER,
org_name VARCHAR(60)
);
-- Each column is a foreign key; together they form the primary key.
CREATE TABLE Person_Organization (
ssn NUMERIC(9) REFERENCES Persons(ssn),
org_code CHAR(4) REFERENCES Organization(org_code),
PRIMARY KEY (ssn, org_code)
);