SQL table: create 1-to-1 relationship with itself? - sql

I want to create a 1-to-1 relationship on a table with itself.
I have a table MenuItem, but I want the items to be able to have a parent MenuItem. One item can only have one parent, but an item can be parent to multiple items.
I am currently working with a link table, MenuItemParent, but I can't figure out how to get the keys and constraints right. It has two columns: MenuItemId and ParentId. Both are foreign keys to the MenuItem table.
If I make the first or both columns Primary key, I seem to end up with a 1-to-many relationship. (I'm generating code from the DB so I can verify it.)
If I only make the first column Primary Key, I end up in a sort of Schrödinger state where a MenuItem can both have a single parent and have multiple parents (i.e. the generated POCO has both a MenuItem property and an EntitySet<MenuItem> property.) I could build my code around this, but then it's not clear from either the model or the generated code what kind of relationship it actually is.
What am I missing?
As to why I'm using a link table, I'm trying to employ vertical segmentation, as this data will not be accessed as often.
A 1-1 relationship effectively partitions the attributes (columns) in
a table into two tables. This is called vertical segmentation. This is
often done for sub-classing the table entities, or, for another
reason, if the usage patterns on the columns in the table indicate
that a few of the columns need to be accessed significantly more often
than the rest of the columns. (Say one or two columns will be accessed
1000s of times per second and the other 40 columns will be accessed
only once a month). Partitioning the table in this way in effect will
optimize the storage pattern for those two different queries.
From: https://stackoverflow.com/a/5112498/125938
Edit: premature optimization aside, I now understand I could simply use a ParentId column in the MenuItem table, but is this really better than using a link table?

You should add a ParentID column to your table MenuItem with a foreign key.
This is an example on how to do that.
alter table MenuItem
add ParentID int null;
alter table MenuItem
add constraint FK_MenuItemParent foreign key (ParentID) references MenuItem (ID);
Now you have a hierarchical table, which means that a menu item can have only one parent, but many other menu items can have the same menu item as their parent.
A link table is only needed for a many-to-many relationship, which is not the case here.
You can also create a unique index on both columns, as suggested, but beware that ParentID will often be null, so add a filter clause to handle that. Note that because ID is already the primary key, every (ID, ParentID) pair is unique anyway, so this index does not restrict the data any further.
create unique nonclustered index idx_MenuParentID
on MenuItem(ID, ParentID)
where ParentID is not null;
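To see the self-referencing foreign key in action, here is a minimal sketch. It is an assumption-laden illustration, not the poster's code: it uses Python's built-in sqlite3 module with an in-memory SQLite database standing in for SQL Server, and the Name column and sample rows are made up.

```python
import sqlite3

# SQLite stands in for SQL Server here; the Name column and the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when this is on
conn.execute("""
    CREATE TABLE MenuItem (
        ID       INTEGER PRIMARY KEY,
        Name     TEXT NOT NULL,
        ParentID INTEGER NULL REFERENCES MenuItem (ID)
    )
""")
conn.executemany(
    "INSERT INTO MenuItem (ID, Name, ParentID) VALUES (?, ?, ?)",
    [(1, "File", None), (2, "Open", 1), (3, "Save", 1)],
)

# One item can be the parent of many items...
children = [row[0] for row in conn.execute(
    "SELECT Name FROM MenuItem WHERE ParentID = 1 ORDER BY ID")]

# ...but a row that points at a non-existent parent is rejected.
try:
    conn.execute("INSERT INTO MenuItem (ID, Name, ParentID) VALUES (4, 'Orphan', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Each row has a single ParentID value, so "at most one parent" falls out of the column itself; no extra constraint is needed for that part.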

Get rid of the "link" table. Just set up your MenuItem table with an ID (PK) column and a ParentID (FK) column. Set up the foreign key relationship (I'll assume you can figure that out). Then set up a "Unique Key" constraint on the ParentID and ID columns.

I think you should make one column the PRIMARY KEY and the other a FOREIGN KEY that references MenuItem. A relationship between a table and itself is called a self-reference (you can search for that term for more info), so it can't have two foreign keys.

Related

How can I replace the existing primary key with a new primary key on my table?

I'm working with a legacy SQL Server database which has a core table with a bad primary key.
The key is of type NVARCHAR(50) and contains an application-generated string based on various things in the table. For obvious reasons, I'd like to replace this key with an auto-incrementing (identity) INT column.
This is a huge database and we're upgrading it piece-by-piece. We want to minimize the changes to tables that other components write to. I figured I could change the table without breaking anything by just:
1. Adding the new Id column to the table and making it nullable
2. Filling it with unique integers and making it NOT NULL
3. Dropping the existing primary key while ensuring there's a uniqueness constraint still on that column
4. Setting the new Id column to be the new primary key and identity
Item 3 is proving very painful. Because this is a core table, there are a lot of other tables with foreign key constraints on it. To drop the existing primary key, it seems I have to delete all these foreign key constraints and create them again afterwards.
Is there an easier way to do this or will I just have to script everything?
Afraid that is the bad news. We just got through a big project of doing the same type of thing, although our head DBA had a few tricks up his sleeve. You might look at something like this to get your scripts generated for the flipping of the switch:
I once did the same thing and basically used the process you describe. Except, of course, that you first have to visit each of the other tables and add a new foreign key pointing to the new column in your base table.
So the approach I used was
1. Add a new column with an auto incrementing integer in the base table, ensure it has a unique index on it (to be replaced later by the primary key)
2. For each foreign key relationship pointing to the base table add a new column in the child table. (note this can result in adding more than one column in the child table if more than one relationship)
3. For each instance of a key in the child table enter a value into the new foreign key field(s)
4. Replace your foreign key relationships such that the new column now serves
5. Make the new column in the base table the primary
6. Drop the old primary key in the base table and each old foreign key in the children.
It is doable and not as hard as it might sound at first. The crux is a series of update statements for the child tables, along these lines:
update child_table
set new_column = (select b.new_primary from base b
                  where b.old_primary = child_table.old_foreign);
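The backfill step can be sketched end to end as follows. This is only an illustration under assumptions: SQLite (via Python's sqlite3) stands in for SQL Server, the table and column names mirror the pseudocode above, and the sample keys such as 'ORD-A' are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- base already has both the old string key and the new integer key
    CREATE TABLE base (old_primary TEXT PRIMARY KEY, new_primary INTEGER UNIQUE);
    -- child_table has the old foreign key plus the freshly added, still empty, new column
    CREATE TABLE child_table (id INTEGER PRIMARY KEY, old_foreign TEXT, new_column INTEGER);
    INSERT INTO base VALUES ('ORD-A', 1), ('ORD-B', 2);
    INSERT INTO child_table (id, old_foreign) VALUES (10, 'ORD-B'), (11, 'ORD-A'), (12, 'ORD-B');
""")

# Correlated subquery: each child row looks up the new key of the parent it
# currently references through the old key.
conn.execute("""
    UPDATE child_table
    SET new_column = (SELECT b.new_primary
                      FROM base AS b
                      WHERE b.old_primary = child_table.old_foreign)
""")

backfilled = conn.execute(
    "SELECT id, new_column FROM child_table ORDER BY id").fetchall()
```

The correlation on old_primary = old_foreign is what matters; without it the subquery would return every row of base rather than the one matching parent.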

Adding foreign keys to an already existing database

I'm trying to export data from an Excel spreadsheet into a fairly complex relational database. The spreadsheet indicates "foreign keys" by stating the names of other objects. (Luckily, I have some control over the spreadsheet, so I can guarantee these names are unique AND that the objects they reference actually exist).
I have a program that can recreate these tables in a MSSql database, but it can't automatically link them to each other. Besides, I don't want to use the actual names of the objects as the primary key since eventually the database will be large.
So, if I have many existing but unconnected tables which refer to each other by their "name" fields, how can I add a foreign key that links them by their IDs?
A simplified version of what I have:
Parent
ID: 1 (PK)
Name: Mary
Child
ID: 2 (PK)
Name: Jane
ParentName: Mary
And what I want to achieve:
Child
ID: 2 (PK)
Name: Jane
ParentID: 1 (FK)
Thanks for any help! I wasn't able to find an example of how to add a foreign key mapping after the fact, or on a different field.
See the ALTER TABLE syntax for MSSQL. You can come up with something like this to add the constraint to the table:
ALTER TABLE Child
ADD CONSTRAINT Child_Parent_FK FOREIGN KEY (ParentID) REFERENCES Parent(ID)
Then once the constraint is in, try something like:
UPDATE Child
SET ParentID = (SELECT ID FROM Parent WHERE Name = ParentName)
That should work if you can guarantee the Name of the Parent is unique. Otherwise you can use SELECT TOP 1 in the subquery (SQL Server does not support LIMIT). But if there are multiple Parents with the same Name, you're going to need to add extra logic (which isn't specified in your original post).
Since you're going to be doing this regularly, I think you should import into a staging table. I like to isolate staging tables in their own schema.
Use the staging table to retrieve or generate the keys you need, then insert/update your OLTP tables based on the data in the staging table. Finally, truncate the staging table.
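A rough sketch of the staging-table flow, using the Parent/Child example from the question. The assumptions: SQLite through Python's sqlite3 stands in for MSSQL, StagingChild is a hypothetical table name, and the staging table is emptied with DELETE because SQLite has no TRUNCATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Parent (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL UNIQUE);
    CREATE TABLE Child (
        ID       INTEGER PRIMARY KEY,
        Name     TEXT NOT NULL,
        ParentID INTEGER REFERENCES Parent (ID)
    );
    -- Hypothetical staging table: rows exactly as they appear in the spreadsheet.
    CREATE TABLE StagingChild (Name TEXT NOT NULL, ParentName TEXT NOT NULL);

    INSERT INTO Parent (ID, Name) VALUES (1, 'Mary');
    INSERT INTO StagingChild (Name, ParentName) VALUES ('Jane', 'Mary');
""")

# Resolve parent names to parent IDs while moving rows out of staging.
conn.execute("""
    INSERT INTO Child (Name, ParentID)
    SELECT s.Name, p.ID
    FROM StagingChild AS s
    JOIN Parent AS p ON p.Name = s.ParentName
""")
conn.execute("DELETE FROM StagingChild")  # stand-in for TRUNCATE TABLE

child_rows = conn.execute("SELECT Name, ParentID FROM Child").fetchall()
staging_left = conn.execute("SELECT COUNT(*) FROM StagingChild").fetchone()[0]
```

Because the name lookup happens during the insert, the real Child table never needs a ParentName column at all.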

Copying 1-to-1 relationships with identity fields in SQL

Suppose you have two tables that are in a one-to-one relationship; i.e. the primary key of the child table is also the foreign key that links it to the parent table. Suppose also that the primary key of the parent is an identity field (a monotonically increasing integer that is assigned by the database when the record is inserted).
Suppose that you need to copy records from these two tables into a second pair of identical tables -- the primary key of the parent is an identity, and the foreign key linking the child to the parent is also the child's primary key.
How should I copy records from one set of tables to the other set?
I currently have three solutions, but I'd like to know if there are others that are better.
Option 1: Temporarily disable the identity property in the destination parent table. Copy records from the parent table, then the child table, keeping the same values for the primary key. Cross your fingers that there are no conflicts (value of primary key of source table already exists in destination table).
Option 2: Temporarily add a column to the destination parent table to hold the "old" (source) primary key. Copy records from the parent table, allowing the database to assign a new primary key but saving the old primary key in the temporary column. Copy records from the child table, joining the source child table to the destination parent table via the old primary key, using the join to insert the record into the destination child table with the new primary key. Drop the temporary column from the destination parent table.
Option 3: Copy sequentially record-by-record, first from parent to parent, then child to child, using DB-provided "identity of last inserted record" functions to ensure that the link is maintained.
Of these options, I think option 2 is my preference. Does anyone prefer one of the other two options, and if so, why? Does anyone have a different solution that is "better"?
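Option 2 can be sketched like this. It is a sketch under stated assumptions, not a definitive implementation: SQLite via Python's sqlite3 stands in for the real database, AUTOINCREMENT plays the role of the identity property, and all table names, column names and rows (src_parent, dst_parent, old_id and so on) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_parent (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT);
    CREATE TABLE src_child  (id INTEGER PRIMARY KEY REFERENCES src_parent (id), detail TEXT);
    CREATE TABLE dst_parent (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT);
    CREATE TABLE dst_child  (id INTEGER PRIMARY KEY REFERENCES dst_parent (id), detail TEXT);

    INSERT INTO src_parent (id, name) VALUES (1, 'a'), (2, 'b');
    INSERT INTO src_child  (id, detail) VALUES (1, 'a-detail'), (2, 'b-detail');
    -- The destination already uses ids 1 and 2, so the source keys cannot be reused.
    INSERT INTO dst_parent (id, name) VALUES (1, 'x'), (2, 'y');
    INSERT INTO dst_child  (id, detail) VALUES (1, 'x-detail'), (2, 'y-detail');

    -- Step 1: temporary column that remembers the old (source) key.
    ALTER TABLE dst_parent ADD COLUMN old_id INTEGER;
    -- Step 2: copy parents and let the database assign new ids.
    INSERT INTO dst_parent (name, old_id)
    SELECT name, id FROM src_parent ORDER BY id;
    -- Step 3: copy children, translating old ids to new ids through old_id.
    INSERT INTO dst_child (id, detail)
    SELECT p.id, c.detail
    FROM src_child AS c
    JOIN dst_parent AS p ON p.old_id = c.id;
    -- Step 4 (not run here): drop old_id once the copy has been verified.
""")

copied = conn.execute("""
    SELECT p.id, p.name, c.detail
    FROM dst_parent AS p
    JOIN dst_child AS c ON c.id = p.id
    WHERE p.old_id IS NOT NULL
    ORDER BY p.id
""").fetchall()
```

The copied rows get fresh keys in the destination, and each child still lines up with its own parent because the join goes through the remembered old key.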
This is one reason why it is so critical to remember that even if you use a surrogate key (like an identity column), you always need a business key. I.e., there always needs to be some other unique constraint on the table. If you had that, then another choice would be to insert the values into the copy of the parent table without the identity values and use that unique key to insert the proper parent value for the child rows.
If you do not have that unique key, then given your situation, I agree that your best solution would likely be Option #2.
Before you decide on an approach to copy data to a new set of tables, you should investigate the following items:
Which tables reference the data in the parent and child tables (both sets of tables)?
Are there any stored procedures/triggers that utilize the data in these tables?
How does this table get populated? Is there an application/data feed that inserts data in this table?
How does the data in this table get deleted?
What is the purpose of the primary key beyond ensuring uniqueness in the table? For this you will have to understand how the data in the table is used by the application.
Based on the answers, you should be able to pick the right solution that will meet the requirements of the application.
My money is on Option 1 (see SET IDENTITY_INSERT, http://msdn.microsoft.com/en-us/library/ms188059.aspx).
But: Why are you copying them?
If you are just altering the table schema, or migrating to new tables and retiring the old ones, why not use ALTER TABLE?
If you are going to run them side-by-side you probably need the keys to match.
But to answer your question, use Option 1, definitely.

Database how to model 1:1 relationship

(VS2008, SqlCE 3.5)
I am trying to model a 1:1 relationship. I put the foreign key in the parent table, holding the PK of the child table, and then made the foreign key UNIQUE. Still, when I create my entity classes (with SqlMetal), the child class has a reference to an EntitySet of the parent, not just a single entity. This looks like an m:1 relation. What do I need to do to make it 1:1?
EDIT1:
I'm confused.. Trying to make a set, like this:
StrategySet(ID, EntryStrategyID{Unique}, ExitStrategyID{Unique})
EntryStrategy(ID)
ExitStrategy(ID)
This is m:1, isn't it? Though it looks like the FKs are in the parent, or wouldn't we call StrategySet the parent? And how would I now change this to 1:1?
First of all, the parent is the table that is referenced by the FK from the child. So you can't say that your parent table references the child: that is not correct.
Secondly, 1:1 relations can be made through:
Primary Keys in both tables
Primary Key in parent and Unique Foreign Key in child
So in your case, the architecture is correct. I suppose you should check the structure again, and look through this article.
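The second variant (primary key in the parent, unique foreign key in the child) can be checked with a small sketch. Assumptions: SQLite through Python's sqlite3 stands in for SqlCE, and the generic Parent and Child tables here are hypothetical rather than the asker's StrategySet schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when this is on
conn.executescript("""
    CREATE TABLE Parent (ID INTEGER PRIMARY KEY);
    CREATE TABLE Child (
        ID       INTEGER PRIMARY KEY,
        -- UNIQUE is what turns the usual 1-to-many foreign key into 1:1
        ParentID INTEGER NOT NULL UNIQUE REFERENCES Parent (ID)
    );
    INSERT INTO Parent (ID) VALUES (1);
    INSERT INTO Child (ID, ParentID) VALUES (10, 1);
""")

# A second child for the same parent violates the UNIQUE constraint.
try:
    conn.execute("INSERT INTO Child (ID, ParentID) VALUES (11, 1)")
    second_child_rejected = False
except sqlite3.IntegrityError:
    second_child_rejected = True
```

This only shows what the database enforces; whether a code generator such as SqlMetal maps the unique foreign key to a single entity instead of an EntitySet is a separate question that this sketch does not answer.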
If all columns in EntryStrategy and ExitStrategy are the same, then all you need is simply this (add all other columns too).
If EntryStrategy and ExitStrategy have some different columns, then use this. Keep all common columns in the Strategy table. EntryStrategy and ExitStrategy have only columns specific to each one.
Here is also a generic example of 1:1 due to vertical partitioning of a table.
[Figure: the table before and after vertical partitioning]
Let me understand the situation you are describing.
You have set of fields which make up a "Strategy". A subset of the fields are conceptually the "EntryStrategy" and a non-intersecting subset of the fields are the "ExitStrategy".
In your case a given set of values making up an "EntryStrategy" can be joined with one and only one set of values making up an "ExitStrategy". This is what you mean when you say there is a 1:1 correspondence.
As smirkingman said earlier, in classic relational database modeling, all of these fields belong in a single table because no subset of the fields appears in more than one record.
If you could have multiple ExitStrategies for a single EntryStrategy then you would have two tables with the EntryStrategy being the parent and the ExitStrategies being the children and the ExitStrategy records would have a Foreign Key pointing to the EntryStrategy parent record.
If you could have multiple EntryStrategies for a single ExitStrategy then you would have two tables with the ExitStrategy being the parent and the EntryStrategies being the children and the EntryStrategy records would have a Foreign Key pointing to the ExitStrategy parent record.
If you could have multiple EntryStrategies associated with multiple ExitStrategies then you would have a many-to-many relationship which requires a third table to maintain the correspondences.
The principles of classic database modeling would put all your fields in one table.
As St Woland wrote, you can enforce the 1:1 relationship by having two tables where the foreign key in the child table is a Unique index. But two tables are normally used for 1-to-many relationships.
As Damir wrote, you can enforce the 1:1 relationship by having three tables where the third table has a foreign key to each of the other two tables and both foreign key fields are marked as Unique indices. However, normally you only use three tables in this fashion when you have a many-to-many relationship.
I think you are expecting way too much from the automated data modeling tools to expect them to construct entities that represent your very unconventional approach.
The answer to your main question is simple. How do I represent a 1:1 relationship? You put them in the same record in a single table!

SQL: Do you need an auto-incremental primary key for Many-Many tables?

Say you have a Many-Many table between Artists and Fans. When it comes to designing the table, do you design the table like such:
ArtistFans
ArtistFanID (PK)
ArtistID (FK)
UserID (FK)
(ArtistID and UserID will then be constrained with a Unique Constraint
to prevent duplicate data)
Or do you use a compound PK for the two relevant fields:
ArtistFans
ArtistID (PK)
UserID (PK)
(The need for the separate unique constraint is removed because of the
compound PK)
Are there any advantages (maybe indexing?) to using the former schema?
ArtistFans
ArtistID (PK)
UserID (PK)
The use of an auto incremental PK has no advantages here, even if the parent tables have them.
I'd also create a "reverse PK" index automatically on (UserID, ArtistID) too: you will need it because you'll query the table by both columns.
Autonumber/ID columns have their place. You'd choose them to improve certain things after the normalisation process based on the physical platform. But not for link tables: if your braindead ORM insists, then change ORMs...
Edit, Oct 2012
It's important to note that you'd still need unique (UserID, ArtistID) and (ArtistID, UserID) indexes. Adding an auto-increment column just uses more space (in memory, not just on disk) that shouldn't be used.
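As a rough illustration of the compound-key layout with the "reverse" index, here is a minimal sketch. It assumes SQLite (through Python's sqlite3) in place of the real database, leaves out the foreign keys to the artist and user tables for brevity, and uses a made-up index name and sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ArtistFans (
        ArtistID INTEGER NOT NULL,
        UserID   INTEGER NOT NULL,
        PRIMARY KEY (ArtistID, UserID)   -- compound key, no surrogate column
    );
    -- Reverse index so that lookups by UserID do not have to scan the table.
    CREATE UNIQUE INDEX IX_ArtistFans_User_Artist ON ArtistFans (UserID, ArtistID);
    INSERT INTO ArtistFans VALUES (1, 100), (1, 101), (2, 100);
""")

# The compound primary key already prevents duplicate pairs.
try:
    conn.execute("INSERT INTO ArtistFans VALUES (1, 100)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Querying from the "other side" of the relationship.
artists_of_user_100 = [row[0] for row in conn.execute(
    "SELECT ArtistID FROM ArtistFans WHERE UserID = 100 ORDER BY ArtistID")]
```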
Assuming that you're already a devotee of the surrogate key (you're in good company), there's a case to be made for going all the way.
A key point that is sometimes forgotten is that relationships themselves can have properties. Often it's not enough to state that two things are related; you might have to describe the nature of that relationship. In other words, there's nothing special about a relationship table that says it can only have two columns.
If there's nothing special about these tables, why not treat it like every other table and use a surrogate key? If you do end up having to add properties to the table, you'll thank your lucky presentation layers that you don't have to pass around a compound key just to modify those properties.
I wouldn't even call this a rule of thumb, more of a something-to-consider. In my experience, some slim majority of relationships end up carrying around additional data, essentially becoming entities in themselves, worthy of a surrogate key.
The rub is that adding these keys after the fact can be a pain. Whether the cost of the additional column and index is worth the value of preempting this headache, that really depends on the project.
As for me, once bitten, twice shy – I go for the surrogate key out of the gate.
Even if you create an identity column, it doesn't have to be the primary key.
ArtistFans
ArtistFanId
ArtistId (PK)
UserId (PK)
Identity columns can be useful to relate this relation to other relations. For example, if there was a creator table which specified the person who created the artist-user relation, it could have a foreign key on ArtistFanId, instead of the composite ArtistId+UserId primary key.
Also, identity columns are required (or greatly improve the operation of) certain ORM packages.
I cannot think of any reason to use the first form you list. The compound primary key is fine, and having a separate, artificial primary key (along with the unique constraint you need on the foreign keys) will just take more time to compute and space to store.
The standard way is to use the composite primary key. Adding in a separate autoincrement key is just creating a substitute that is already there using what you have. Proper database normalization patterns would look down on using the autoincrement.
Funny how all answers favor variant 2, so I have to dissent and argue for variant 1 ;)
To answer the question in the title: no, you don't need it. But...
Having an auto-incremental or identity column in every table simplifies your data model so that you know that each of your tables always has a single PK column.
As a consequence, every relation (foreign key) from one table to another always consists of a single column for each table.
Further, if you happen to write some application framework for forms, lists, reports, logging etc you only have to deal with tables with a single PK column, which simplifies the complexity of your framework.
Also, an additional id PK column does not cost you very much in terms of disk space (except for billion-record-plus tables).
Of course, I need to mention one downside: in a grandparent-parent-child relation, child will lose its grandparent information and require a JOIN to retrieve it.
In my opinion, in pure SQL an id column is not necessary and should not be used. But for ORM frameworks such as Hibernate, managing many-to-many relations is not simple with compound keys etc., especially if the join table has extra columns.
So if I am going to use an ORM framework on the db, I prefer putting an auto-increment id column on that table, together with a unique constraint on the referencing columns. And of course, a not-null constraint if it is required.
Then I treat the table just like any other table in my project.
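For comparison, the surrogate-key variant described here might look like the sketch below. The assumptions: SQLite via Python's sqlite3 stands in for the real database, and the Since column is a hypothetical extra property of the relationship added only to show an update through the single-column key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ArtistFans (
        ArtistFanID INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        ArtistID    INTEGER NOT NULL,
        UserID      INTEGER NOT NULL,
        Since       TEXT,                               -- hypothetical extra property
        UNIQUE (ArtistID, UserID)                       -- still needed to block duplicates
    );
    INSERT INTO ArtistFans (ArtistID, UserID) VALUES (1, 100);
""")

# Application code can address the row by its single-column key.
row_id = conn.execute(
    "SELECT ArtistFanID FROM ArtistFans WHERE ArtistID = 1 AND UserID = 100"
).fetchone()[0]
conn.execute("UPDATE ArtistFans SET Since = '2012-10-01' WHERE ArtistFanID = ?", (row_id,))

# The unique constraint does the job the compound primary key did before.
try:
    conn.execute("INSERT INTO ArtistFans (ArtistID, UserID) VALUES (1, 100)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

since = conn.execute(
    "SELECT Since FROM ArtistFans WHERE ArtistFanID = ?", (row_id,)).fetchone()[0]
```

The trade-off raised in the other answers still applies: the extra column and its index cost space, and the unique constraint on (ArtistID, UserID) cannot be dropped.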