Should the same key be used as both FK and PK? - sql

On a database I have the following:
USERS
Id (PK)
Name
DOCTORS
Id (PK)
UserId (FK)
CurriculumVitae
Birthdate
...
All doctors are also Users of the application but have extra columns, and I am displaying all doctors on a page.
Does it make sense to have Id and UserId on DOCTORS table?
Or should I make UserId both the PK and FK of table DOCTORS:
DOCTORS
UserId (PK, FK)
CurriculumVitae
Birthdate
...
In your opinion when should I go for this option?

Yes, you can make UserId (as you have defined it) both the primary key and foreign key. The primary key specifies that it is unique within the table. The foreign key maps it back to the Users table.
This is an example of a subsetting relationship.
I would actually call the columns Users.UserId and Doctors.DoctorId. I think that is less confusing nomenclature that captures the most important aspect of the two columns. However, the actual names are not important for your question (and yours are reasonable).

If you did not also make it the PK (or at least unique), the table could hold duplicate rows for the same user, which is probably not what you want.
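As a rough illustration of the shared PK/FK layout, here is a minimal sketch using Python's built-in sqlite3 module. The choice of SQLite, the sample rows, and the helper `try_insert` are my own assumptions for the demo, not part of the question; only the table and column names come from it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is on
conn.executescript("""
    CREATE TABLE Users (
        Id   INTEGER PRIMARY KEY,
        Name TEXT NOT NULL
    );
    -- UserId is both the primary key and a foreign key to Users.Id
    CREATE TABLE Doctors (
        UserId          INTEGER PRIMARY KEY REFERENCES Users (Id),
        CurriculumVitae TEXT,
        Birthdate       TEXT
    );
""")
conn.execute("INSERT INTO Users (Id, Name) VALUES (1, 'Alice')")
conn.execute("INSERT INTO Doctors (UserId, CurriculumVitae) VALUES (1, 'cv text')")


def try_insert(sql):
    """Hypothetical helper: report whether the database accepted the statement."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


# A second Doctors row for the same user violates the primary key.
duplicate_doctor = try_insert("INSERT INTO Doctors (UserId) VALUES (1)")
# A Doctors row for a user that does not exist violates the foreign key.
unknown_user = try_insert("INSERT INTO Doctors (UserId) VALUES (99)")
print(duplicate_doctor, unknown_user)
```

The primary key gives you "at most one doctor row per user", and the foreign key gives you "every doctor is a user".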

Related

Is this a good database design practice?

I have a Person table; each Person can visit several countries. The countries visited by each Person are stored in the table CountryVisit:
Person:
PersonId,
Name
CountryVisit:
CountryVisitId (primary key)
PersonId (foreign key to 'Person.PersonId')
CountryName
VisitDate
For the CountryVisit table, my primary key is CountryVisitId, which is an identity column. With this design, a Person can have only one CountryVisit, yet its CountryVisitId can be 40, for example. Would it be better practice to add another surrogate key column to act as the identity column, while CountryVisitId becomes a natural key that is unique for each PersonId?
It is pretty good. I would suggest that you have a separate table for countries, with one row per country. Then the CountryVisits table would have:
CountryVisitId PrimaryKey,
PersonId ForeignKey,
CountryId ForeignKey,
VisitDate
This will ensure that the country name is always spelled correctly and consistently. If you want a list of countries to get started, check out this Wikipedia page. Also note that your definition of country may be different from the standard list of countries (there are actually several out there), so you should use your own auto-incremented primary key, rather than using the country code.
And, you should relax the requirement and remove the unique or primary key on PersonId, CountryId, unless you want to enforce only one visit per country.
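To make the suggested layout concrete, here is a small sketch with Python's sqlite3 module. SQLite, the sample data, and the exact column types are assumptions on my part; the point is only that a surrogate CountryVisitId with no unique key on (PersonId, CountryId) lets the same person visit the same country more than once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce FKs
conn.executescript("""
    CREATE TABLE Person (
        PersonId INTEGER PRIMARY KEY,
        Name     TEXT NOT NULL
    );
    -- One row per country, so the name is stored (and spelled) once.
    CREATE TABLE Country (
        CountryId   INTEGER PRIMARY KEY,
        CountryName TEXT NOT NULL UNIQUE
    );
    CREATE TABLE CountryVisit (
        CountryVisitId INTEGER PRIMARY KEY,  -- surrogate key
        PersonId       INTEGER NOT NULL REFERENCES Person (PersonId),
        CountryId      INTEGER NOT NULL REFERENCES Country (CountryId),
        VisitDate      TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO Person (PersonId, Name) VALUES (1, 'Ann')")
conn.execute("INSERT INTO Country (CountryId, CountryName) VALUES (1, 'France')")

# Two visits by the same person to the same country are both accepted.
conn.executemany(
    "INSERT INTO CountryVisit (PersonId, CountryId, VisitDate) VALUES (?, ?, ?)",
    [(1, 1, "2020-01-05"), (1, 1, "2021-07-19")],
)

visits = conn.execute("""
    SELECT c.CountryName, v.VisitDate
    FROM CountryVisit v
    JOIN Country c ON c.CountryId = v.CountryId
    WHERE v.PersonId = 1
    ORDER BY v.VisitDate
""").fetchall()
print(visits)
```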

MS SQL creating many-to-many relation with a junction table

I'm using Microsoft SQL Server Management Studio and while creating a junction table should I create an ID column for the junction table, if so should I also make it the primary key and identity column? Or just keep 2 columns for the tables I'm joining in the many-to-many relation?
For example, if these were the many-to-many tables:
MOVIE
Movie_ID
Name
etc...
CATEGORY
Category_ID
Name
etc...
Should I make the junction table:
MOVIE_CATEGORY_JUNCTION
Movie_ID
Category_ID
Movie_Category_Junction_ID
[and make the Movie_Category_Junction_ID my Primary Key and use it as the Identity Column] ?
Or:
MOVIE_CATEGORY_JUNCTION
Movie_ID
Category_ID
[and just leave it at that with no primary key or identity column] ?
I would use the second junction table:
MOVIE_CATEGORY_JUNCTION
Movie_ID
Category_ID
The primary key would be the combination of both columns. You would also have a foreign key from each column to the Movie and Category table.
The junction table would look similar to this:
create table movie_category_junction
(
    movie_id int,
    category_id int,
    CONSTRAINT movie_cat_pk PRIMARY KEY (movie_id, category_id),
    CONSTRAINT FK_movie
        FOREIGN KEY (movie_id) REFERENCES movie (movie_id),
    CONSTRAINT FK_category
        FOREIGN KEY (category_id) REFERENCES category (category_id)
);
See SQL Fiddle with Demo.
Using these two fields as the PRIMARY KEY will prevent duplicate movie/category combinations from being added to the table.
There are different schools of thought on this. One school prefers including a primary key and naming the linking table something more significant than just the two tables it is linking. The reasoning is that although the table may start out seeming like just a linking table, it may become its own table with significant data.
An example is a many-to-many between magazines and subscribers. Really that link is a subscription with its own attributes, like expiration date, payment status, etc.
However, I think sometimes a linking table is just a linking table. The many to many relationship with categories is a good example of this.
So in this case, a separate single-field primary key is not necessary. You could have an auto-assigned key, which wouldn't hurt anything and would make deleting specific records easier. It might be good as a general practice, so that if the table later develops into a significant table with its own data (as with subscriptions), it will already have an auto-assigned primary key.
You can put a unique index on the two fields to avoid duplicates. This will prevent duplicates even if you have a separate auto-assigned key. Alternatively, you could use both fields as your primary key (which is also a unique index).
So one school of thought sticks with integer auto-assigned primary keys and avoids compound primary keys. This is not the only way to do it, and maybe not the best, but it won't lead you into a problem that you really regret.
But, for something like what you are doing, you will probably be fine with just the two fields. I'd still recommend either making the two fields a compound primary key, or at least putting a unique index on the two fields.
I would go with the second junction table, but make those two fields the primary key. That will prevent duplicate entries.
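For comparison with the compound-key version, here is a sketch of the other option discussed in this thread: a surrogate identity key plus a unique constraint on the two linking columns. It uses Python's sqlite3 module rather than SQL Server, which is an assumption made purely so the example is runnable; the sample rows and the helper `link` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE movie    (movie_id    INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE category (category_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE movie_category_junction (
        movie_category_junction_id INTEGER PRIMARY KEY,  -- surrogate key
        movie_id    INTEGER NOT NULL REFERENCES movie (movie_id),
        category_id INTEGER NOT NULL REFERENCES category (category_id),
        UNIQUE (movie_id, category_id)  -- still blocks duplicate pairs
    );
    INSERT INTO movie    (movie_id, name)    VALUES (1, 'Alien');
    INSERT INTO category (category_id, name) VALUES (1, 'Horror'), (2, 'Sci-Fi');
""")


def link(movie_id, category_id):
    """Hypothetical helper: try to link a movie to a category."""
    try:
        conn.execute(
            "INSERT INTO movie_category_junction (movie_id, category_id) VALUES (?, ?)",
            (movie_id, category_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


results = [link(1, 1), link(1, 2), link(1, 1)]  # the third call repeats a pair
row_count = conn.execute(
    "SELECT COUNT(*) FROM movie_category_junction"
).fetchone()[0]
print(results, row_count)
```

Either way the duplicate pair is rejected; the surrogate key only changes how individual rows are addressed.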

If two tables have a "One To One" relationship, should they have the same column as primary key?

Let's say I have a table called Users which represents registered users of a website. I also have an AccountActivation table which stores the randomly generated string sent to a new user's email to verify that email.
The AccountActivation table has a UserId column, which also happens to be the primary key of the Users table. It also has the ActivationCode column to store the code. Either column could uniquely identify a row in the AccountActivation table.
So if I choose the activation code column as the primary key, I end up having two one-to-one tables with different primary keys. I thought in one to one relationship, the two tables must have the same primary key?
If you choose ActivationCode as PK, then why do you have two one-to-one relations?
The only relation that's there is
AccountActivation.UserId -> Users.UserId
or what else do you think you suddenly have?
If you do what you suggested, then the table Users has its PK on UserId and the table AccountActivation has its PK on ActivationCode. That is not a problem at all, and there's no reason not to do it this way.
Which column (UserId or ActivationCode) you pick for the PK of AccountActivation doesn't matter: it doesn't influence or disturb the FK relationship between AccountActivation and Users, nor does it add an extra one-to-one relationship of any kind.
If you do choose ActivationCode for the PK of AccountActivation, the only extra step that I would take is creating a nonclustered index on UserId so that queries that join the two tables will benefit from maximum performance.
If there is only ever one ActivationCode per user, the two tables could share the UserId. But that would imply that when a user regenerates a key, you should either update the old row or delete it.
But why do you need to store such data? You could also derive the account activation code by some sort of computation and encryption over data unique to the User.
Just to illustrate my suggestion:
Say the Users table has two columns, UserId and CreationDate.
Then the token might be derived from UserId + CreationDate (as an example). You would be able to generate and check it without the extra data in the database. I know that this might not suit your requirements.
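One way this idea could be sketched is with a keyed hash (HMAC-SHA256) over UserId and CreationDate. Note that this swaps in HMAC for the "computation and encryption" mentioned above; the secret key, the function names, and the message format are all hypothetical choices for illustration.

```python
import hashlib
import hmac

# Hypothetical server-side secret; in practice it would come from configuration.
SECRET_KEY = b"replace-with-a-real-secret"


def make_activation_token(user_id: int, creation_date: str) -> str:
    """Derive a token from data already stored in the Users table."""
    message = f"{user_id}|{creation_date}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def is_valid_token(user_id: int, creation_date: str, token: str) -> bool:
    """Recompute the token and compare it in constant time."""
    expected = make_activation_token(user_id, creation_date)
    return hmac.compare_digest(expected, token)


token = make_activation_token(42, "2024-01-01T00:00:00")
print(is_valid_token(42, "2024-01-01T00:00:00", token))  # same user and date
print(is_valid_token(43, "2024-01-01T00:00:00", token))  # different user
```

The trade-off is that a derived token cannot be revoked or regenerated without changing one of its inputs, which is one reason it may not suit every requirement.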
Make the UserId column in the AccountActivation a foreign key to the Users table.
Users
=====
UserId primary key
Name
Address
etc...
AccountActivation
=================
UserId primary key (foreign key to Users.UserId)
ActivationCode (unique constraint)
Now you have a one-to-one relationship
You need not have the same column as primary key in 2 tables to have one-to-one relationship.
You can have any column as primary key in the AccountActivation table.
UserId, which is the foreign key in the AccountActivation table, is the primary key in the Users table. So you should definitely be able to uniquely identify a user's activation code from the AccountActivation table using this column, whether or not it is the primary key of that table (but it should be unique, and I hope it is).
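Here is a minimal sketch of the layout where AccountActivation.UserId is both primary key and foreign key and ActivationCode carries a unique constraint, including the "replace the old row when a code is regenerated" step mentioned earlier. It uses Python's sqlite3 module; SQLite's `INSERT OR REPLACE`, the sample values, and the helper `set_activation_code` are assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Users (
        UserId INTEGER PRIMARY KEY,
        Name   TEXT
    );
    CREATE TABLE AccountActivation (
        UserId         INTEGER PRIMARY KEY REFERENCES Users (UserId),
        ActivationCode TEXT NOT NULL UNIQUE
    );
    INSERT INTO Users (UserId, Name) VALUES (1, 'Ann');
""")


def set_activation_code(user_id, code):
    """Hypothetical helper: store a code, replacing the user's previous one.

    Caveat: OR REPLACE also replaces a row that conflicts on ActivationCode,
    so codes are assumed to be random enough never to collide across users.
    """
    conn.execute(
        "INSERT OR REPLACE INTO AccountActivation (UserId, ActivationCode) "
        "VALUES (?, ?)",
        (user_id, code),
    )


set_activation_code(1, "abc123")
set_activation_code(1, "xyz789")  # regenerating keeps a single row per user

rows = conn.execute(
    "SELECT UserId, ActivationCode FROM AccountActivation"
).fetchall()
print(rows)
```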

implementing UNIQUE across linked tables in MySQL

A USER is a PERSON and a PERSON has a COMPANY: user -> person is one-to-one, person -> company is many-to-one.
person_id is FK in USER table.
company_id is FK in PERSON table.
A PERSON may not be a USER, but a USER is always a PERSON.
If company_id was in user table, I could create a unique key based on username and company_id, but it isn't, and would be a duplication of data if it was.
Currently, I'm implementing the unique username/company ID rule in the RoseDB manager wrapper code, but it feels wrong. I'd like to define the unique rule at the DB level if I can, but I'm not sure exactly how to approach it. I tried something like this:
alter table user add unique(used_id,person.company_id);
but that doesn't work.
By reading through the documentation, I can't find an example that does anything even remotely similar. Am I trying to add functionality that doesn't exist, or am I missing something here?
Well, there's nothing simple that does what you want. You can probably enforce the constraint you need using BEFORE INSERT and BEFORE UPDATE triggers, though. See this SO question about raising MySQL errors for how to handle making the triggers fail.
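To show the shape of the trigger approach, here is a sketch using SQLite through Python's sqlite3 module. This is not MySQL syntax (MySQL would raise the error differently, as the linked question discusses), and the `username` column, trigger name, and sample rows are assumptions. It covers only BEFORE INSERT; a complete version would also need triggers for updates to the username and to a person's company.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE company (company_id INTEGER PRIMARY KEY);
    CREATE TABLE person (
        person_id  INTEGER PRIMARY KEY,
        company_id INTEGER REFERENCES company (company_id)
    );
    CREATE TABLE user (
        person_id INTEGER PRIMARY KEY REFERENCES person (person_id),
        username  TEXT NOT NULL
    );

    -- Reject a new user whose username is already taken within the
    -- company of the person being inserted.
    CREATE TRIGGER user_unique_per_company
    BEFORE INSERT ON user
    BEGIN
        SELECT RAISE(ABORT, 'username already used in this company')
        WHERE EXISTS (
            SELECT 1
            FROM user u
            JOIN person p ON p.person_id = u.person_id
            WHERE u.username = NEW.username
              AND p.company_id = (SELECT company_id FROM person
                                  WHERE person_id = NEW.person_id)
        );
    END;

    INSERT INTO company (company_id) VALUES (1), (2);
    INSERT INTO person (person_id, company_id) VALUES (1, 1), (2, 1), (3, 2);
""")


def add_user(person_id, username):
    """Hypothetical helper: True if the row was accepted, False if rejected."""
    try:
        conn.execute(
            "INSERT INTO user (person_id, username) VALUES (?, ?)",
            (person_id, username),
        )
        return True
    except sqlite3.DatabaseError:
        return False


first = add_user(1, "sam")          # company 1
same_company = add_user(2, "sam")   # company 1 again
other_company = add_user(3, "sam")  # company 2
print(first, same_company, other_company)
```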
Are there more attributes to your PERSON table? Reason I ask is that what you want to implement is a typical corollary table:
USERS table:
user_id (pk)
USER_COMPANY_XREF (nee PERSON) table:
user_id (pk, fk)
company_id (pk, fk)
EFFECTIVE_DATE (not null)
EXPIRY_DATE (not null)
COMPANIES table:
company_id (pk)
The primary key of the USER_COMPANY_XREF table being a composite key of USERS.user_id and COMPANIES.company_id would allow you to associate a user with more than one company while not duplicating data in the USERS table, and provide referential integrity.
You could define the UNIQUE constraint in the Person table:
CREATE TABLE Company (
    company_id SERIAL PRIMARY KEY
) ENGINE=InnoDB;
CREATE TABLE Person (
    person_id SERIAL PRIMARY KEY,
    company_id BIGINT UNSIGNED,
    UNIQUE KEY (person_id, company_id),
    FOREIGN KEY (company_id) REFERENCES Company (company_id)
) ENGINE=InnoDB;
CREATE TABLE User (
    person_id BIGINT UNSIGNED PRIMARY KEY,
    FOREIGN KEY (person_id) REFERENCES Person (person_id)
) ENGINE=InnoDB;
But actually you don't need the unique constraint even in the Person table, because person_id is already unique on its own. There's no way a given person_id could reference two companies.
So I'm not sure what problem you're trying to solve.
Re your comment:
That doesn't solve the issue of allowing the same username to exist in different companies.
So you want a given username to be unique within one company, but usable in different companies? That was not clear to me from your original question.
So if you don't have many other attributes specific to users, I'd combine User with Person and add an "is_user" column. Or just rely on it being implicitly true that a Person with a non-null cryptpass is by definition a User.
Then your problem with cross-table UNIQUE constraints goes away.
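A rough sketch of the combined table, using Python's sqlite3 module instead of MySQL so that it runs standalone. The `username` column and the convention that a NULL username marks a person who is not a user are my assumptions (the answer mentions an "is_user" flag or a non-null cryptpass instead); the helper `add_person` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE company (company_id INTEGER PRIMARY KEY);
    CREATE TABLE person (
        person_id  INTEGER PRIMARY KEY,
        company_id INTEGER NOT NULL REFERENCES company (company_id),
        username   TEXT,  -- assumption: NULL means "not a user"
        UNIQUE (username, company_id)
    );
    INSERT INTO company (company_id) VALUES (1), (2);
""")


def add_person(person_id, company_id, username):
    """Hypothetical helper: True if the row was accepted, False if rejected."""
    try:
        conn.execute(
            "INSERT INTO person (person_id, company_id, username) VALUES (?, ?, ?)",
            (person_id, company_id, username),
        )
        return True
    except sqlite3.IntegrityError:
        return False


outcomes = [
    add_person(1, 1, "sam"),  # accepted
    add_person(2, 1, "sam"),  # same username in the same company
    add_person(3, 2, "sam"),  # same username in another company
    add_person(4, 1, None),   # non-user
    add_person(5, 1, None),   # NULLs do not collide in a UNIQUE constraint
]
print(outcomes)
```

With both columns in one table, an ordinary UNIQUE constraint expresses the rule and no trigger is needed.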

Database Design

I am making a webapp right now and I am trying to get my head around the database design.
I have a user model(username (which is primary key), password, email, website)
I have an entry model(id, title, content, comments, commentCount)
A user can only comment on an entry once. What is the best and most efficient way to go about doing this?
At the moment, I am thinking of another table that has username (from user model) and entry id (from entry model)
**username    id**
Sonic       4
Sonic       5
Knuckles    2
Sonic       6
Amy         15
Sonic       20
Knuckles    5
Amy         4
So then to list comments for entry 4 it searches for id=4.
On a side note:
Instead of storing a commentCount, would it be better to calculate the comment count from the database each time when needed?
Your design is basically sound. Your third table should be named something like UsersEntriesComments, with fields UserName, EntryID and Comment. In this table, you would have a compound primary key consisting of the UserName and EntryID fields; this would enforce the rule that each user can comment on each entry only once. The table would also have foreign key constraints such that UserName must be in the Users table, and EntryID must be in the Entries table (the ID field, specifically).
You could add an ID field to the Users table, but many programmers (myself included) advocate the use of "natural" keys where possible. Since UserNames must be unique in your system, this is a perfectly valid (and easily readable) primary key.
Update: just read your question again. You don't need the Comments or the CommentsCount fields in your Entries table. Comments would properly be stored in the UsersEntriesComments table, and the counts would be calculated dynamically in your queries (saving you the trouble of updating this value yourself).
Update 2: James Black makes a good point in favor of not using UserName as the primary key, and instead adding an artificial primary key to the table (UserID or some such). If you use UserName as the primary key, allowing a user to change their user name is more difficult, as you have to change the username in all the related tables as well.
What exactly do you mean by
entry model(id, title, content, **comments**, commentCount)
(emphasis mine)? Since it looks like you have multiple comments per entity, they should be stored in a separate table:
comments(id, entry_id, content, user_id)
entry_id and user_id are foreign keys to respective tables. Now you just need to create a unique index on (entry_id, user_id) to ensure user can only add one comment per entity.
Also, you may want to create a surrogate (numeric, generated via sequence / identity) primary key for your users table instead of making user name your PK.
Here's my recommendation for your data model:
USERS table
USER_ID (pk, int)
USER_NAME
PASSWORD
EMAIL
WEBSITE
ENTRY table
ENTRY_ID (pk, int)
ENTRY_TITLE
CONTENT
ENTRY_COMMENTS table
ENTRY_ID (pk, fk)
USER_ID (pk, fk)
COMMENT
This setup allows an ENTRY to have 0+ comments. When a comment is added, the primary key being a composite key of ENTRY_ID and USER_ID means that the pair can only exist once in the table (IE: 1, 1 won't allow 1, 1 to be added again).
Do not store counts in a table - use a VIEW for that so the number can be generated based on existing data at the time of execution.
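Here is a sketch of that model with Python's sqlite3 module, including a view that computes the comment count on demand. SQLite, the view name `entry_comment_counts`, the trimmed column lists, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE users (
        user_id   INTEGER PRIMARY KEY,
        user_name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE entry (
        entry_id    INTEGER PRIMARY KEY,
        entry_title TEXT
    );
    CREATE TABLE entry_comments (
        entry_id INTEGER NOT NULL REFERENCES entry (entry_id),
        user_id  INTEGER NOT NULL REFERENCES users (user_id),
        comment  TEXT NOT NULL,
        PRIMARY KEY (entry_id, user_id)  -- one comment per user per entry
    );
    -- Counts are derived at query time instead of being stored.
    CREATE VIEW entry_comment_counts AS
        SELECT e.entry_id, COUNT(c.user_id) AS comment_count
        FROM entry e
        LEFT JOIN entry_comments c ON c.entry_id = e.entry_id
        GROUP BY e.entry_id;

    INSERT INTO users (user_id, user_name) VALUES (1, 'Sonic'), (2, 'Amy');
    INSERT INTO entry (entry_id, entry_title) VALUES (4, 'First'), (5, 'Second');
    INSERT INTO entry_comments (entry_id, user_id, comment)
        VALUES (4, 1, 'Nice'), (4, 2, 'Agreed');
""")

# A second comment by the same user on the same entry is rejected.
try:
    conn.execute(
        "INSERT INTO entry_comments (entry_id, user_id, comment) "
        "VALUES (4, 1, 'Again')"
    )
    second_comment_accepted = True
except sqlite3.IntegrityError:
    second_comment_accepted = False

counts = conn.execute(
    "SELECT entry_id, comment_count FROM entry_comment_counts ORDER BY entry_id"
).fetchall()
print(second_comment_accepted, counts)
```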
I wouldn't use the username as a primary ID. I would make a numeric id with autoincrement.
I would use that new id in the relations table, with a unique key on the two fields.
Even though it isn't in the question, you may want to have a userid that is the primary key. Otherwise it will be difficult to let users change their username, or you will have to make sure people know that usernames cannot be changed.
Make the joined table have a unique constraint on the userid and entryid. That way the database forces that there is only one comment/entry/user.
It would help if you specified a database, btw.
It sounds like you want to guarantee that the set of comments is unique with respect to username X post_id. You can do this by using a unique constraint, or if your database system doesn't support that explicitly, with an index that does the same. Here's some SQL expressing that:
CREATE TABLE users (
    username VARCHAR(10) PRIMARY KEY
    -- any other data ...
);
CREATE TABLE posts (
    post_id INTEGER PRIMARY KEY
    -- any other data ...
);
CREATE TABLE comments (
    username VARCHAR(10) REFERENCES users(username),
    post_id INTEGER REFERENCES posts(post_id),
    -- any other data ...
    UNIQUE (username, post_id) -- Here's the important bit!
);