I have a table where I have to relate groups of primary keys to one another. There is no information about these groups other than that they exist. Specifically, I am storing groups of Editions of Books together, so I can know that a certain set of books are editions of each other.
I currently have a setup where I have an Edition_Group_ISBN column, where one of the ISBNs is arbitrarily chosen to group a set of Editions together.
The typical approach for this problem is to have a separate table like Book_Editions, where I would have an autoincrementing integer primary key like "Edition_Group_ID" linking ISBNs together. I have been told this method is preferable.
However, the problem with implementing that system relates to the loading in of data. How am I to dynamically load in Edition Groups? One solution might be to lock the table and do a transaction on the next ID in the autoincrement. I imagine this would be slower and more cumbersome than my current method, though.
Given the difficulty of inserting data under that system, what is the optimal system to address this problem?
You load in Edition Groups by having an ISBN table foreign key in your Edition Groups table, and then inner-joining your two tables together in your query, using the Primary Key of your ISBN Books table, and the Foreign Key of your Edition Groups table in the join.
ISBN Table
ISBN_ID // Auto-incrementing Primary Key
ISBN
Book_Title
etc.
EDITIONS Table
Edition_ID
FIRST_EDITION_ISBN_ID
ASSOCIATED_ISBN_ID // Foreign Key to ISBN table
Most database systems have a way to return the Primary Key ID of a newly inserted record, so:
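-- T-SQL sketch: capture the identity value generated by the insert.
DECLARE @NewId INT;
INSERT INTO ISBN (ISBN, Book_Title)
VALUES ('12345678', 'The Frog Prince');
SET @NewId = SCOPE_IDENTITY(); -- Returns the new ISBN_ID from the ISBN table.
-- Assumes this book is the first edition of its group, so it references itself.
INSERT INTO EDITIONS (FIRST_EDITION_ISBN_ID, ASSOCIATED_ISBN_ID)
VALUES (@NewId, @NewId);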
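To make the "no table lock needed" point concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server. The table and column names (edition_group, book) and the ISBN strings are made up for illustration; the idea is only that the database hands back the generated group ID for your own insert, so each load can run in an ordinary transaction.

```python
import sqlite3

# Assumption: hypothetical schema and placeholder ISBNs, not from the original post.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE edition_group (
        edition_group_id INTEGER PRIMARY KEY AUTOINCREMENT
    );
    CREATE TABLE book (
        isbn             TEXT PRIMARY KEY,
        edition_group_id INTEGER NOT NULL REFERENCES edition_group (edition_group_id)
    );
""")

def load_edition_group(conn, isbns):
    """Create one new group and attach every ISBN to it in a single transaction."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute("INSERT INTO edition_group DEFAULT VALUES")
        group_id = cur.lastrowid  # ID generated by this connection's insert
        conn.executemany(
            "INSERT INTO book (isbn, edition_group_id) VALUES (?, ?)",
            [(isbn, group_id) for isbn in isbns],
        )
    return group_id

first = load_edition_group(conn, ["9780000000001", "9780000000002"])
second = load_edition_group(conn, ["9780000000003"])
```

Each call gets its own group ID without the caller ever having to guess or reserve "the next ID" up front.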
Sometimes, there are certain tables in an application with only one column in each of them. Data of records within the respective columns are unique. Examples are: a table for country names, a table for product names (up to 60 characters long, say), a table for company codes (3 characters long and determined by the user), a table for address types (say, billing, delivery), etc.
For tables like these, as the records are unique and not null, the only column can be used as the primary key, technically speaking.
So my question is, is it good enough to use that column as the primary key for the table? Or, is it still desirable to add another column (country_id, product_id, company_id, addresstype_id) as the primary key for the table? Why?
Thanks in advance for any advice.
There is always a debate between using surrogate keys and composite keys as the primary key. Composite primary keys always add some complexity to your database design, and therefore to your application.
Suppose another table needs a direct relationship to your resulting table (the billing table). In the composite-key scenario, that related table needs four columns in order to connect to the billing table. If you use a surrogate key instead, you have a single identity column (simplicity), and you can still create a unique constraint on (country_id, product_id, company_id, addresstype_id).
That said, it is hard to say one approach is better than the other, because both have pros and cons.
You can check this for more information.
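For the single-column lookup tables in the question, the surrogate-key option might look like the sketch below. It uses Python's sqlite3 module, and the country table and its values are hypothetical; it only shows that a surrogate key and a uniqueness rule on the natural value can coexist.

```python
import sqlite3

# Assumption: illustrative table and values, not taken from the question.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE country (
        country_id INTEGER PRIMARY KEY,   -- surrogate key
        name       TEXT NOT NULL UNIQUE   -- natural value, still kept unique
    )
""")
conn.execute("INSERT INTO country (name) VALUES ('France')")

try:
    conn.execute("INSERT INTO country (name) VALUES ('France')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Renaming touches one row; rows that reference country_id would not change.
conn.execute("UPDATE country SET name = 'French Republic' WHERE country_id = 1")
country_rows = conn.execute("SELECT country_id, name FROM country").fetchall()
```

Whether the extra column is worth it depends on how often the natural value changes and how many tables reference it.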
This may seem like a simple question, but I am stumped:
I have created a database about cars (in Oracle SQL Developer). Among other tables, I have a table called Manufacturer and a table called Parentcompany.
Since some manufacturers are owned by bigger corporations, I will also show them in my database.
The parentcompany table is the "parent table" and the Manufacturer table the "child table".
For both I have created columns, each table having its own primary key.
For some reason, when I inserted the values for my columns, I was able to use the same value for the primary key of Manufacturer and Parentcompany.
The column: ManufacturerID is primary Key of Manufacturer. The value for this is: 'MBE'
The column: ParentcompanyID is primary key of Parentcompany. The value for this is 'MBE'
Both have the same value. Do I have a problem with the thinking logic?
Or do I just not understand how primary keys work?
Does a primary key only need to be unique in a table, and not the database?
I would appreciate it if someone shed light on the situation.
A primary key only needs to be unique within its own table.
Have a look at this tutorial: SQL - Primary key
A primary key is a field in a table which uniquely identifies each
row/record in a database table. Primary keys must contain unique
values. A primary key column cannot have NULL values.
A table can have only one primary key, which may consist of single or
multiple fields. When multiple fields are used as a primary key, they
are called a composite key.
If a table has a primary key defined on any field(s), then you cannot
have two records having the same value of that field(s).
A primary key is unique per table. You can use the same PK value in every separate table in the DB. In fact, that often happens, because the PK is often an incrementing number representing the ID of a row: 1, 2, 3, 4...
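A quick way to see this for yourself, sketched with Python's sqlite3 module instead of Oracle (the table definitions are simplified assumptions based on the question):

```python
import sqlite3

# Assumption: one-column versions of the question's tables, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parentcompany (parentcompany_id TEXT PRIMARY KEY);
    CREATE TABLE manufacturer  (manufacturer_id  TEXT PRIMARY KEY);
""")
conn.execute("INSERT INTO parentcompany VALUES ('MBE')")
conn.execute("INSERT INTO manufacturer  VALUES ('MBE')")  # allowed: different table

try:
    conn.execute("INSERT INTO manufacturer VALUES ('MBE')")  # same table: rejected
    second_insert_ok = True
except sqlite3.IntegrityError:
    second_insert_ok = False
```

The same value is accepted once in each table, and only the repeat within one table raises an error.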
For your case, a more common implementation would be a hierarchical table called Company with the fields company_name and parent_company_name. If a company has a parent, its parent_company_name field would hold a value from the company_name field.
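A rough sketch of that hierarchical table, again with Python's sqlite3 module. The company names are placeholders, and using NULL for "no parent" is my assumption rather than something stated above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.execute("""
    CREATE TABLE company (
        company_name        TEXT PRIMARY KEY,
        -- Assumption: NULL means the company has no parent.
        parent_company_name TEXT REFERENCES company (company_name)
    )
""")
# Placeholder names; parents are inserted before their children.
conn.executemany(
    "INSERT INTO company VALUES (?, ?)",
    [("Parent Corp", None), ("Car Maker A", "Parent Corp"), ("Car Maker B", "Parent Corp")],
)
children = [
    row[0]
    for row in conn.execute(
        "SELECT company_name FROM company "
        "WHERE parent_company_name = ? ORDER BY company_name",
        ("Parent Corp",),
    )
]
```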
There are several reasons why the same value in two different PKs might work out with no problems. In your case, it seems to flow naturally from the semantics of the data.
A row in the Manufacturers table and a row in the ParentCompany table both appear to refer to the same thing, namely a company. In that case, giving a company the same id in both tables is not only possible, but actually useful. It represents a 1 to 1 correspondence between manufacturers and parent companies without adding extra columns to serve as FKs.
Thanks for the quick answers!
I think I know what to do now. I will create a general company table, in which all companies will be stored. Then I will create, as I go along specific company tables like Manufacturer and parent company that reference a certain company in the company table.
To clarify, the only column I would put into the sub-company tables is a column with a foreign key referencing a column of the company table, yes?
For the primary key, I was just confused, because I hear so much about the key needing to be unique, and can't have the same value as another. So then this condition only goes for tables, not the whole database. Thanks for the clarification!
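One way the "general company table plus specific tables" plan could be laid out, as a sketch in Python's sqlite3 module. The column names and the sample company name are assumptions; the point is that each sub-table's only column can be both its primary key and a foreign key to company.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE company (
        company_id TEXT PRIMARY KEY,
        name       TEXT NOT NULL
    );
    -- Each sub-table's primary key is also a foreign key to company.
    CREATE TABLE manufacturer (
        company_id TEXT PRIMARY KEY REFERENCES company (company_id)
    );
    CREATE TABLE parentcompany (
        company_id TEXT PRIMARY KEY REFERENCES company (company_id)
    );
""")
conn.execute("INSERT INTO company VALUES ('MBE', 'Example Company')")  # placeholder name
conn.execute("INSERT INTO manufacturer VALUES ('MBE')")
conn.execute("INSERT INTO parentcompany VALUES ('MBE')")

try:
    conn.execute("INSERT INTO manufacturer VALUES ('XYZ')")  # no such company
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False
```

Here 'MBE' appears as a key in all three tables, which is fine because each key is only unique within its own table.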
I'm using Microsoft SQL Server Management Studio. When creating a junction table, should I create an ID column for the junction table and, if so, should I also make it the primary key and identity column? Or should I just keep the two columns for the tables I'm joining in the many-to-many relation?
For example, if these were the many-to-many tables:
MOVIE
Movie_ID
Name
etc...
CATEGORY
Category_ID
Name
etc...
Should I make the junction table:
MOVIE_CATEGORY_JUNCTION
Movie_ID
Category_ID
Movie_Category_Junction_ID
[and make the Movie_Category_Junction_ID my Primary Key and use it as the Identity Column] ?
Or:
MOVIE_CATEGORY_JUNCTION
Movie_ID
Category_ID
[and just leave it at that with no primary key or identity column] ?
I would use the second junction table:
MOVIE_CATEGORY_JUNCTION
Movie_ID
Category_ID
The primary key would be the combination of both columns. You would also have a foreign key from each column to the Movie and Category table.
The junction table would look similar to this:
CREATE TABLE movie_category_junction
(
    movie_id int,
    category_id int,
    CONSTRAINT movie_cat_pk PRIMARY KEY (movie_id, category_id),
    CONSTRAINT FK_movie
        FOREIGN KEY (movie_id) REFERENCES movie (movie_id),
    CONSTRAINT FK_category
        FOREIGN KEY (category_id) REFERENCES category (category_id)
);
See SQL Fiddle with Demo.
Using these two fields as the PRIMARY KEY will prevent duplicate movie/category combinations from being added to the table.
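If you want to watch the composite key reject a duplicate, here is a small sketch using Python's sqlite3 module instead of SQL Server. The sample rows are invented; the junction table mirrors the one above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE movie    (movie_id    INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE category (category_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE movie_category_junction (
        movie_id    INTEGER NOT NULL REFERENCES movie (movie_id),
        category_id INTEGER NOT NULL REFERENCES category (category_id),
        PRIMARY KEY (movie_id, category_id)
    );
    -- Assumption: made-up sample data.
    INSERT INTO movie    VALUES (1, 'Movie A');
    INSERT INTO category VALUES (10, 'Comedy'), (20, 'Drama');
""")
conn.execute("INSERT INTO movie_category_junction VALUES (1, 10)")
conn.execute("INSERT INTO movie_category_junction VALUES (1, 20)")  # same movie, new category

try:
    conn.execute("INSERT INTO movie_category_junction VALUES (1, 10)")  # exact repeat
    duplicate_link_ok = True
except sqlite3.IntegrityError:
    duplicate_link_ok = False
```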
There are different schools of thought on this. One school prefers including a primary key and naming the linking table something more significant than just the two tables it is linking. The reasoning is that although the table may start out seeming like just a linking table, it may become its own table with significant data.
An example is a many-to-many between magazines and subscribers. Really that link is a subscription with its own attributes, like expiration date, payment status, etc.
However, I think sometimes a linking table is just a linking table. The many to many relationship with categories is a good example of this.
So in this case, a separate single-field primary key is not necessary. You could have an auto-assign key, which wouldn't hurt anything and would make deleting specific records easier. It might be good as a general practice, so that if the table later develops into a significant table with its own significant data (as subscriptions did), it will already have an auto-assign primary key.
You can put a unique index on the two fields to avoid duplicates. This will even prevent duplicates if you have a separate auto-assign key. You could use both fields as your primary key (which is also a unique index).
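As a sketch of the "auto-assign key plus unique index" variant, using Python's sqlite3 module (the table name movie_category and the index name are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumption: illustrative names; foreign keys omitted to keep the sketch short.
    CREATE TABLE movie_category (
        movie_category_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- auto-assigned key
        movie_id          INTEGER NOT NULL,
        category_id       INTEGER NOT NULL
    );
    CREATE UNIQUE INDEX ux_movie_category ON movie_category (movie_id, category_id);
""")
cur = conn.execute("INSERT INTO movie_category (movie_id, category_id) VALUES (1, 10)")
link_id = cur.lastrowid

try:
    conn.execute("INSERT INTO movie_category (movie_id, category_id) VALUES (1, 10)")
    pair_repeated = True
except sqlite3.IntegrityError:
    pair_repeated = False

# Deleting one specific link only needs the single-column key.
conn.execute("DELETE FROM movie_category WHERE movie_category_id = ?", (link_id,))
remaining = conn.execute("SELECT COUNT(*) FROM movie_category").fetchone()[0]
```

The unique index does the duplicate prevention here; the auto-assigned key only gives each link a short handle.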
So one school of thought sticks with integer auto-assign primary keys and avoids compound primary keys. This is not the only way to do it, and maybe not the best, but it won't lead you wrong, into a problem where you really regret it.
But, for something like what you are doing, you will probably be fine with just the two fields. I'd still recommend either making the two fields a compound primary key, or at least putting a unique index on the two fields.
I would go with the second junction table, but make those two fields the primary key. That will prevent duplicate entries.
I have the following tables in MySQL server:
Companies:
- UID (unique)
- NAME
- other relevant data
Offices:
- UID (unique)
- CompanyID
- ExternalID
- other data
Employees:
- UID (unique)
- OfficeID
- ExternalID
- other data
In each one of them the UID is unique identifier, created by the database.
There are foreign keys to ensure the links between Employee -> Office -> Company on the UID.
The ExternalID fields in Offices and Employees are the IDs provided to my application by the Company (my client(s), actually). The clients do not have (and do not care about) my own IDs, and all the data my application receives from them is identified solely by their IDs (i.e. ExternalID in my tables).
I.e. a request from the client in pseudo-language is like "I'm Company X, update the data for my employee Y".
I need to enforce uniqueness on the combination of CompanyID and Employees.ExternalID, so in my database there will be no duplicate ExternalID for the employees of the same company.
I was thinking about 3 possible solutions:
1. Change the schema for Employees to include CompanyID, and create a unique constraint on the two fields.
2. Enforce a trigger, which upon update/insert in Employees validates the uniqueness.
3. Enforce the check on application level (i.e. my receiving service).
The alternative dbadmin in me says that (3) is the worst solution, as it does not protect the database from inconsistency in case of an application bug or something else, and it will most probably be the slowest one.
The trigger solution may be what I want, but it may become complicated, especially if multiple inserts/updates need to be performed in a single statement, and I'm not sure about its performance vs. (1).
And (1) looks like the fastest and easiest approach, but it kind of goes against my understanding of the relational model.
What is the opinion of SO DB experts on the pros and cons of each approach, especially if there is a possibility of adding an additional level of indirection (i.e. Company -> Office -> Department -> Employee) while the same uniqueness (Company/Employee) needs to be preserved?
You're right - #1 is the best option.
Granted, I would question it at first glance (because of shortcutting) but knowing the business rule to ensure an employee is only related to one company - it makes sense.
Additionally, I'd have a foreign key relating the companyid in the employee table to the companyid in the office table. Otherwise, you allow an employee to be related to a company without an office. Unless that is acceptable...
Triggers are a last resort if the relationship can not be demonstrated in the data model, and servicing the logic from the application means the logic is centralized - there's no opportunity for bad data to occur, unless someone drops constraints (which means you have bigger problems).
Each of your company-provided tables should include CompanyID in the UNIQUE KEY over the company-provided IDs.
Company-provided referential integrity should use company-provided ids:
CREATE TABLE company (
    uid INT NOT NULL PRIMARY KEY,
    name TEXT
);
CREATE TABLE office (
    uid INT NOT NULL PRIMARY KEY,
    companyID INT NOT NULL,
    externalID INT NOT NULL,
    UNIQUE KEY (companyID, externalID),
    FOREIGN KEY (companyID) REFERENCES company (uid)
);
CREATE TABLE employee (
    uid INT NOT NULL PRIMARY KEY,
    companyID INT NOT NULL,
    officeID INT NOT NULL,
    externalID INT NOT NULL,
    UNIQUE KEY (companyID, externalID),
    FOREIGN KEY (companyID) REFERENCES company (uid),
    FOREIGN KEY (companyID, officeID) REFERENCES office (companyID, externalID)
);
etc.
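To check how these constraints behave, here is a sketch that rebuilds the same idea in Python's sqlite3 module rather than MySQL (so UNIQUE replaces UNIQUE KEY, and the sample companies, offices, and IDs are invented). The try_insert helper is hypothetical and exists only to report whether a row was accepted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE company (uid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE office (
        uid        INTEGER PRIMARY KEY,
        companyID  INTEGER NOT NULL REFERENCES company (uid),
        externalID INTEGER NOT NULL,
        UNIQUE (companyID, externalID)
    );
    CREATE TABLE employee (
        uid        INTEGER PRIMARY KEY,
        companyID  INTEGER NOT NULL,
        officeID   INTEGER NOT NULL,
        externalID INTEGER NOT NULL,
        UNIQUE (companyID, externalID),
        FOREIGN KEY (companyID, officeID) REFERENCES office (companyID, externalID)
    );
    -- Assumption: made-up sample data.
    INSERT INTO company VALUES (1, 'Company X'), (2, 'Company Y');
    INSERT INTO office (companyID, externalID) VALUES (1, 100), (2, 100);
""")

def try_insert(company_id, office_id, external_id):
    """Return True if the employee row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO employee (companyID, officeID, externalID) VALUES (?, ?, ?)",
            (company_id, office_id, external_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert(1, 100, 7),  # accepted
    try_insert(2, 100, 7),  # accepted: same external ID, different company
    try_insert(1, 100, 7),  # rejected: duplicate within company 1
    try_insert(1, 999, 8),  # rejected: company 1 has no office 999
]
```

Because the employee row carries companyID itself, the uniqueness rule is a plain constraint and needs no trigger.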
Set auto_increment_increment to the number of tables you have.
SET auto_increment_increment = 3; (you might want to set this in your my.cnf)
Then manually set the starting auto_increment value of each table to a different value:
the first table to 1, the second table to 2, the third table to 3.
Table 1 will have values like 1,4,7,10,13,etc
Table 2 will have values like 2,5,8,11,14,etc
Table 3 will have values like 3,6,9,12,15,etc
Of course, this is just ONE option; personally, I'd just make it a combo value. It could be as simple as (TableID, AutoincrementID), where the TableID is constant in all rows of a table.
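The arithmetic behind the interleaved sequences can be checked with a few lines of Python. This only models the intended number series (start value plus multiples of the step); it does not talk to MySQL, and the names id_sequence and NUM_TABLES are mine.

```python
from itertools import count, islice

def id_sequence(start, step):
    """Yield start, start + step, start + 2*step, ..."""
    return count(start, step)

NUM_TABLES = 3  # assumption: three tables, as in the example above

# First five IDs each table would receive under this scheme.
sequences = {
    table: list(islice(id_sequence(table, NUM_TABLES), 5))
    for table in range(1, NUM_TABLES + 1)
}
all_ids = [i for ids in sequences.values() for i in ids]
```

Since every table starts at a different remainder modulo the step, no ID can appear in two tables.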
I am new to SQL Server 2008 database development.
Here I have a master table named ‘Student’ and a child table named ‘Address’. The common column between these tables is ‘Student ID’.
My doubts are:
1. Do we need to put ‘Address Id’ in the ‘Address’ table and make it the primary key? Is it mandatory? (I won’t be using this ‘Address Id’ in any of my reports.)
2. Is a primary key column a must in any table?
Would you please help me on these.
Would you please also recommend good links/tutorials on SQL Server 2008 database design practices (if you are aware of any), covering naming conventions, best practices, SQL optimizations, etc.?
1) Yes, having an ADDRESS_ID column as the primary key of the ADDRESS table is a good idea.
But having the STUDENT_ID as a foreign key in the ADDRESS table is not a good idea. This means that an address record can only be associated to one student. Students can have roommates, so they'd have identical addresses. Which comes back to why it's a good idea to have the ADDRESS_ID column as a primary key, as it will indicate a unique address record.
Rather than have the STUDENT_ID column in the ADDRESS table, I'd have a corollary/xref/lookup table between the STUDENT and ADDRESS tables:
STUDENT_ADDRESSES_XREF
STUDENT_ID, pk, fk to STUDENTS table
ADDRESS_ID, pk, fk to ADDRESS table
EFFECTIVE_DATE, date, not null
EXPIRY_DATE, date, not null
This uses a composite primary key, so that only one combination of a given student and address exists. I added the dates in case there is a need to know exactly when, because someone could move back home, etc., after all.
Most importantly, this works off the ADDRESS_ID column to allow for a single address to be associated to multiple people.
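Here is how the xref layout might look in practice, sketched with Python's sqlite3 module rather than SQL Server. The student names, street, and dates are placeholders, and dates are stored as text only because SQLite has no dedicated date type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE address (address_id INTEGER PRIMARY KEY, street TEXT NOT NULL);
    CREATE TABLE student_addresses_xref (
        student_id     INTEGER NOT NULL REFERENCES student (student_id),
        address_id     INTEGER NOT NULL REFERENCES address (address_id),
        effective_date TEXT NOT NULL,
        expiry_date    TEXT NOT NULL,
        PRIMARY KEY (student_id, address_id)
    );
    -- Assumption: made-up sample data.
    INSERT INTO student VALUES (1, 'Student A'), (2, 'Student B');
    INSERT INTO address VALUES (1, '1 Example Street');
    -- Two roommates share one address row.
    INSERT INTO student_addresses_xref VALUES
        (1, 1, '2010-09-01', '2011-06-30'),
        (2, 1, '2010-09-01', '2011-06-30');
""")
roommates = [
    row[0]
    for row in conn.execute("""
        SELECT s.name
        FROM student s
        JOIN student_addresses_xref x ON x.student_id = s.student_id
        WHERE x.address_id = 1
        ORDER BY s.name
    """)
]
```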
2) Yes, defining a primary key is frankly a must for any table.
In most databases, the act also creates an index - making searching more efficient. That's on top of the usual things like making sure a record is a unique entry...
Every table should have a way to uniquely and unambiguously identify a record. Make AddressID the primary key for the address table.
Without a primary key, the database will allow duplicate records; possibly creating join problems or trigger problems (if you implement them) down the road.
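A small sketch of the difference, using Python's sqlite3 module. The table names address_no_pk and address_pk and the sample row are invented for the comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumption: two hypothetical variants of the same table.
    CREATE TABLE address_no_pk (student_id INTEGER, street TEXT);
    CREATE TABLE address_pk (
        address_id INTEGER PRIMARY KEY,
        student_id INTEGER,
        street     TEXT
    );
""")
row = (1, "1 Example Street")
for _ in range(2):
    conn.execute("INSERT INTO address_no_pk VALUES (?, ?)", row)
    conn.execute("INSERT INTO address_pk (student_id, street) VALUES (?, ?)", row)

# Without a key, the two copies cannot be told apart by their columns...
indistinguishable = conn.execute(
    "SELECT COUNT(*) FROM address_no_pk "
    "WHERE student_id = 1 AND street = '1 Example Street'"
).fetchone()[0]

# ...with a key, each row can still be addressed individually.
conn.execute("DELETE FROM address_pk WHERE address_id = 2")
remaining_with_pk = conn.execute("SELECT COUNT(*) FROM address_pk").fetchone()[0]
```

Note that a surrogate key alone identifies rows but does not stop repeated content; for that you would still add a unique constraint on the columns that must not repeat.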