Adding foreign keys to an already existing database - sql

I'm trying to export data from an Excel spreadsheet into a fairly complex relational database. The spreadsheet indicates "foreign keys" by stating the names of other objects. (Luckily, I have some control over the spreadsheet, so I can guarantee these names are unique AND that the objects they reference actually exist).
I have a program that can recreate these tables in an MS SQL database, but it can't automatically link them to each other. Besides, I don't want to use the actual names of the objects as the primary key since eventually the database will be large.
So, if I have many existing but unconnected tables which refer to each other by their "name" fields, how can I add a foreign key that links them by their IDs?
A simplified version of what I have:
Parent
ID: 1 (PK)
Name: Mary
Child
ID: 2 (PK)
Name: Jane
ParentName: Mary
And what I want to achieve:
Child
ID: 2 (PK)
Name: Jane
ParentID: 1 (FK)
Thanks for any help! I wasn't able to find an example of how to add a foreign key mapping after the fact, or on a different field.

See the ALTER TABLE syntax for MSSQL. Assuming the ParentID column does not exist yet, add it first, then add the constraint to the table:
ALTER TABLE Child
ADD ParentID INT NULL;
ALTER TABLE Child
ADD CONSTRAINT Child_Parent_FK FOREIGN KEY (ParentID) REFERENCES Parent(ID);
Then once the constraint is in, try something like:
UPDATE Child
SET ParentID = (SELECT Parent.ID FROM Parent WHERE Parent.Name = Child.ParentName)
That should work if you can guarantee the Name of the Parent is unique. Otherwise you can add TOP 1 to the subquery (SQL Server does not support LIMIT). But if there are multiple Parents with the same Name, you're going to need extra logic (which isn't specified in your original post).
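The order of operations matters here: the ParentID column has to exist before it can be filled or constrained. Below is a minimal sketch of the whole sequence. It uses SQLite through Python's sqlite3 module as a stand-in for SQL Server so that it runs as-is; the table and column names follow the question, the function name link_children_by_name is made up for illustration, and SQLite declares the foreign key while adding the column rather than through a separate ADD CONSTRAINT.

```python
import sqlite3


def link_children_by_name(conn: sqlite3.Connection) -> None:
    """Add Child.ParentID and fill it by looking up Parent.Name."""
    # SQLite attaches the foreign key while adding the column; SQL Server
    # would use ALTER TABLE ... ADD CONSTRAINT ... FOREIGN KEY instead.
    conn.execute(
        "ALTER TABLE Child ADD COLUMN ParentID INTEGER REFERENCES Parent(ID)"
    )
    # Correlated subquery: each child picks up the ID of the parent whose
    # Name matches its ParentName (names are assumed to be unique).
    conn.execute(
        "UPDATE Child SET ParentID = "
        "(SELECT p.ID FROM Parent AS p WHERE p.Name = Child.ParentName)"
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript(
    """
    CREATE TABLE Parent (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL UNIQUE);
    CREATE TABLE Child (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL, ParentName TEXT);
    INSERT INTO Parent (ID, Name) VALUES (1, 'Mary');
    INSERT INTO Child (ID, Name, ParentName) VALUES (2, 'Jane', 'Mary');
    """
)
link_children_by_name(conn)
print(conn.execute("SELECT ID, Name, ParentID FROM Child").fetchall())
```

Once the IDs are in place and verified, the ParentName column can be dropped.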

Since you're going to be doing this regularly, I think you should import into a staging table. I like to isolate staging tables in their own schema.
Use the staging table to retrieve or generate the keys you need, then insert/update your OLTP tables based on the data in the staging table. Finally, truncate the staging table.
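A rough sketch of that staging flow, assuming a hypothetical StagingChild table that holds the spreadsheet rows exactly as exported. SQLite (via Python's sqlite3) stands in for SQL Server here so the example is runnable; SQLite has neither schemas in the SQL Server sense nor TRUNCATE, so the staging table sits next to the others and is emptied with DELETE.

```python
import sqlite3

SCHEMA = """
CREATE TABLE Parent (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL UNIQUE);
CREATE TABLE Child (
    ID INTEGER PRIMARY KEY,
    Name TEXT NOT NULL,
    ParentID INTEGER NOT NULL REFERENCES Parent(ID)
);
-- Hypothetical staging table: rows exactly as the spreadsheet has them.
CREATE TABLE StagingChild (Name TEXT NOT NULL, ParentName TEXT NOT NULL);
"""


def load_from_staging(conn: sqlite3.Connection) -> None:
    """Move staged rows into Parent/Child, resolving names to generated IDs."""
    with conn:  # one transaction: commit on success, roll back on error
        # Create any parent that is referenced but does not exist yet.
        conn.execute(
            "INSERT INTO Parent (Name) "
            "SELECT DISTINCT s.ParentName FROM StagingChild AS s "
            "WHERE s.ParentName NOT IN (SELECT Name FROM Parent)"
        )
        # Insert the children with the parent's surrogate key looked up by name.
        conn.execute(
            "INSERT INTO Child (Name, ParentID) "
            "SELECT s.Name, p.ID FROM StagingChild AS s "
            "JOIN Parent AS p ON p.Name = s.ParentName"
        )
        # SQLite has no TRUNCATE; DELETE without WHERE empties the table.
        conn.execute("DELETE FROM StagingChild")


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO StagingChild (Name, ParentName) VALUES (?, ?)",
    [("Jane", "Mary"), ("Tom", "Mary")],
)
load_from_staging(conn)
```

Because the name-to-ID lookup happens inside the database, the import program never needs to know the generated keys.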

SQL table: create 1-to-1 relationship with itself?

I want to create a 1-to-1 relationship on a table with itself.
I have a table MenuItem, but I want the items to be able to have a parent MenuItem. One item can only have one parent, but an item can be parent to multiple items.
I am currently working with a link table, MenuItemParent, but I can't figure out how to get the keys and constraints correctly. It has two columns: MenuItemId and ParentId. Both are foreign keys to the MenuItem table.
If I make the first or both columns Primary key, I seem to end up with a 1-to-many relationship. (I'm generating code from the DB so I can verify it.)
If I only make the first column Primary Key, I end up in a sort of Schrödinger state where a MenuItem can both have a single parent and have multiple parents (i.e. the generated POCO has both a MenuItem property and an EntitySet<MenuItem> property.) I could build my code around this, but then it's not clear from either the model or the generated code what kind of relationship it actually is.
What am I missing?
As to why I'm using a link table, I'm trying to employ vertical segmentation, as this data will not be accessed as often.
"A 1-1 relationship effectively partitions the attributes (columns) in a table into two tables. This is called vertical segmentation. This is often done for sub-classing the table entities, or, for another reason, if the usage patterns on the columns in the table indicate that a few of the columns need to be accessed significantly more often than the rest of the columns. (Say one or two columns will be accessed 1000s of times per second and the other 40 columns will be accessed only once a month). Partitioning the table in this way in effect will optimize the storage pattern for those two different queries."
From: https://stackoverflow.com/a/5112498/125938
Edit: premature optimization aside, I now understand I could simply use a ParentId column in the MenuItem table, but is this really better than using a link table?
You should add a ParentID column to your table MenuItem with a foreign key.
Here is an example of how to do that:
alter table MenuItem
add ParentID int null;
alter table MenuItem
add constraint FK_MenuItemParent foreign key (ParentID) references MenuItem (ID);
Now you have a hierarchical table: a menu item can have only one parent, but many menu items can share the same parent.
A link table is only needed for a many-to-many relationship, which is not the case here.
You can also create a unique index on both columns, as suggested, but ParentID will often be null, so add a filter clause to exclude those rows. (Because ID is already the primary key, this index cannot reject any additional rows; it mainly serves lookups.)
create unique nonclustered index idx_MenuParentID
on MenuItem(ID, ParentID)
where ParentID is not null;
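To see that a single nullable ParentID column really gives "one parent, many children", here is a small runnable sketch. It uses SQLite through Python's sqlite3 instead of SQL Server, the Title column and the sample rows are invented for illustration, and children_of is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript(
    """
    CREATE TABLE MenuItem (
        ID INTEGER PRIMARY KEY,
        Title TEXT NOT NULL,                          -- hypothetical column
        ParentID INTEGER NULL REFERENCES MenuItem(ID) -- self-reference
    );
    INSERT INTO MenuItem (ID, Title, ParentID) VALUES
        (1, 'File', NULL),   -- top-level item: no parent
        (2, 'Open', 1),
        (3, 'Save', 1);      -- two items share the same parent
    """
)


def children_of(conn: sqlite3.Connection, parent_id: int) -> list:
    """Return the titles of the items whose parent is parent_id."""
    rows = conn.execute(
        "SELECT Title FROM MenuItem WHERE ParentID = ? ORDER BY ID", (parent_id,)
    )
    return [title for (title,) in rows]


print(children_of(conn, 1))
```

Each row stores at most one ParentID, so an item cannot have two parents, while nothing stops several rows from pointing at the same parent.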
Get rid of the "link" table. Just set up your MenuItem table with an ID (PK) column and a ParentID (FK) column. Set up the foreign key relationship (I'll assume you can figure that out). Then set up a "Unique Key" constraint on the ParentID and ID columns.
I think you should make one column the PRIMARY KEY and the other a FOREIGN KEY that references MenuItem. A relationship between a table and itself is called a self-reference (you can search for that term for more info), and it does not need two FOREIGN KEYs.

Two FK to one PK

I'm using SQL Server Express 2012 and trying to make two relationships, two FKs from the same table to one PK in another table.
The relationship seems to work because it shows up in the database diagram but when I try to save the changes, I receive the following error:
'Members' table saved successfully
'BookedResources' table
- Unable to create relationship 'FK_BookedResourcesMemberId_MembersMemberId'.
The ALTER TABLE statement conflicted with the FOREIGN KEY constraint "FK_BookedResourcesMemberId_MembersMemberId". The conflict occurred in database "resursBokning2", table "dbo.Members", column 'MemberId'.
MemberId in Members is the PK.
BookedResources.EditedBy (FK) -> Members.MemberId (PK)
BookedResources.MemberId (FK) -> Members.MemberId (PK)
Anybody know what this error is about?
I've read that it should be OK to have this kind of relationship so it should work.
From the error you've provided, it looks like you tried to give both foreign keys the same name. As @kinse suggests, give each foreign key relationship a unique name. Also, consider whether you need two foreign keys to the same table - it could indicate that your database model is incomplete.
I am making an assumption that Members wouldn't be editing other members, so EditedBy (a member) and MemberId appear to be unnecessary on the members table.
The error may occur because you used the same name for a foreign key twice, so change the name of the second one to some other value, e.g.:
FK_BookedResourcesMemberId_MembersMemberId2
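One more thing that may be worth ruling out: SQL Server raises this same "conflicted with the FOREIGN KEY constraint" message when existing rows hold values that have no match in the referenced table. The sketch below looks for such orphaned values before the constraint is created. It runs on SQLite through Python's sqlite3 as a stand-in for SQL Server; the BookingId column, the sample rows, and the orphaned_values helper are made up for illustration.

```python
import sqlite3


def orphaned_values(conn: sqlite3.Connection, column: str) -> list:
    """Return BookedResources values in `column` that match no Members.MemberId."""
    if column not in ("MemberId", "EditedBy"):  # guard: the name is spliced into SQL
        raise ValueError(f"unexpected column: {column}")
    rows = conn.execute(
        f"SELECT DISTINCT b.{column} FROM BookedResources AS b "
        f"LEFT JOIN Members AS m ON m.MemberId = b.{column} "
        f"WHERE b.{column} IS NOT NULL AND m.MemberId IS NULL "
        f"ORDER BY b.{column}"
    )
    return [value for (value,) in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Members (MemberId INTEGER PRIMARY KEY);
    CREATE TABLE BookedResources (
        BookingId INTEGER PRIMARY KEY,   -- hypothetical column
        MemberId INTEGER,
        EditedBy INTEGER
    );
    INSERT INTO Members (MemberId) VALUES (1), (2);
    INSERT INTO BookedResources (BookingId, MemberId, EditedBy) VALUES
        (10, 1, 2),
        (11, 2, 0),     -- 0 is not a member, so an FK on EditedBy would fail
        (12, 1, NULL);  -- NULL is allowed by a foreign key
    """
)
print(orphaned_values(conn, "MemberId"), orphaned_values(conn, "EditedBy"))
```

If either list is non-empty, those rows have to be fixed (or set to NULL) before the relationship can be saved.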

How can I replace the existing primary key with a new primary key on my table?

I'm working with a legacy SQL Server database which has a core table with a bad primary key.
The key is of type NVARCHAR(50) and contains an application-generated string based on various things in the table. For obvious reasons, I'd like to replace this key with an auto-incrementing (identity) INT column.
This is a huge database and we're upgrading it piece-by-piece. We want to minimize the changes to tables that other components write to. I figured I could change the table without breaking anything by just:
1. Adding the new Id column to the table and making it nullable
2. Filling it with unique integers and making it NOT NULL
3. Dropping the existing primary key while ensuring there's still a uniqueness constraint on that column
4. Setting the new Id column to be the new primary key and identity
Item 3 is proving very painful. Because this is a core table, there are a lot of other tables with foreign key constraints on it. To drop the existing primary key, it seems I have to delete all these foreign key constraints and create them again afterwards.
Is there an easier way to do this or will I just have to script everything?
Afraid that is the bad news. We just got through a big project of doing the same type of thing, although our head DBA had a few tricks up his sleeve. You might look at something like this to get your scripts generated for the flipping of the switch:
I once did the same thing and basically used the process you describe, except of course you first have to visit each of the other tables and add a new foreign key pointing to the new column in your base table.
So the approach I used was:
1. Add a new column with an auto-incrementing integer in the base table, and ensure it has a unique index on it (to be replaced later by the primary key).
2. For each foreign key relationship pointing to the base table, add a new column in the child table. (Note that this can result in more than one new column in a child table if it has more than one relationship.)
3. For each instance of a key in the child table, enter a value into the new foreign key field(s).
4. Replace your foreign key relationships such that the new column now serves
5. Make the new column in the base table the primary key.
6. Drop the old primary key in the base table and each old foreign key in the children.
It is doable and not as hard as it might sound at first. The crux is a series of update statements for the child tables, along these lines:
Update child_table
set new_column = (select new_primary from base
where base.old_primary = child_table.old_foreign)
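Before swapping the constraints it helps to confirm that the backfill reached every child row. Here is a small sketch of the update plus that check, run on SQLite via Python's sqlite3 rather than SQL Server. The table and column names are the placeholders from the answer; the sample keys and the backfill_new_key helper are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE base (old_primary TEXT PRIMARY KEY, new_primary INTEGER UNIQUE);
    CREATE TABLE child_table (
        id INTEGER PRIMARY KEY,
        old_foreign TEXT REFERENCES base(old_primary),
        new_column INTEGER               -- will become the new foreign key
    );
    INSERT INTO base VALUES ('ORD-A', 1), ('ORD-B', 2);
    INSERT INTO child_table (id, old_foreign) VALUES
        (1, 'ORD-B'), (2, 'ORD-A'), (3, 'ORD-B');
    """
)


def backfill_new_key(conn: sqlite3.Connection) -> int:
    """Copy each child's new integer key from base; return the count of unmapped rows."""
    with conn:
        conn.execute(
            "UPDATE child_table SET new_column = "
            "(SELECT b.new_primary FROM base AS b "
            " WHERE b.old_primary = child_table.old_foreign)"
        )
    (unmapped,) = conn.execute(
        "SELECT COUNT(*) FROM child_table "
        "WHERE old_foreign IS NOT NULL AND new_column IS NULL"
    ).fetchone()
    return unmapped


unmapped_rows = backfill_new_key(conn)
print(unmapped_rows)
```

A non-zero result means some child rows reference a key that has no new integer value yet, which would make the new foreign key fail.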

Copying 1-to-1 relationships with identity fields in SQL

Suppose you have two tables that are in a one-to-one relationship; i.e. the primary key of the child table is also the foreign key that links it to the parent table. Suppose also that the primary key of the parent is an identity field (a monotonically increasing integer that is assigned by the database when the record is inserted).
Suppose that you need to copy records from these two tables into a second pair of identical tables -- the primary key of the parent is an identity, and the foreign key linking the child to the parent is also the child's primary key.
How should I copy records from one set of tables to the other set?
I currently have three solutions, but I'd like to know if there are others that are better.
Option 1: Temporarily disable the identity property in the destination parent table. Copy records from the parent table, then the child table, keeping the same values for the primary key. Cross your fingers that there are no conflicts (value of primary key of source table already exists in destination table).
Option 2: Temporarily add a column to the destination parent table to hold the "old" (source) primary key. Copy records from the parent table, allowing the database to assign a new primary key but saving the old primary key in the temporary column. Copy records from the child table, joining the source child table to the destination parent table via the old primary key, using the join to insert the record into the destination child table with the new primary key. Drop the temporary column from the destination parent table.
Option 3: Copy sequentially record-by-record, first from parent to parent, then child to child, using DB-provided "identity of last inserted record" functions to ensure that the link is maintained.
Of these options, I think option 2 is my preference. Does anyone prefer one of the other two options, and if so, why? Does anyone have a different solution that is "better"?
This is one reason why it is so critical to remember that even if you use a surrogate key (like an identity column), you always need a business key. I.e., there always needs to be some other unique constraint on the table. If you had that, then another choice would be to insert the values into the copy of the parent table without the identity values and use that unique key to insert the proper parent value for the child rows.
If you do not have that unique key, then given your situation, I agree that your best solution would likely be Option #2.
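For reference, Option 2 can be sketched roughly as follows. This runs on SQLite through Python's sqlite3 as a stand-in, AUTOINCREMENT plays the role of the identity property, and the table names (SrcParent, DstParent, ...), the Name and Detail columns, and the OldID column are all invented for the example.

```python
import sqlite3


def copy_with_new_keys(conn: sqlite3.Connection) -> None:
    """Copy SrcParent/SrcChild into DstParent/DstChild, letting the DB assign new IDs."""
    with conn:
        # Temporary column remembering which source row each new parent came from.
        conn.execute("ALTER TABLE DstParent ADD COLUMN OldID INTEGER")
        conn.execute(
            "INSERT INTO DstParent (Name, OldID) SELECT Name, ID FROM SrcParent"
        )
        # Join source children to the new parents through the remembered old key.
        conn.execute(
            "INSERT INTO DstChild (ParentID, Detail) "
            "SELECT d.ID, c.Detail FROM SrcChild AS c "
            "JOIN DstParent AS d ON d.OldID = c.ParentID"
        )
        # Last step of Option 2: drop OldID again. Left as a comment because
        # older SQLite builds lack DROP COLUMN:
        #   ALTER TABLE DstParent DROP COLUMN OldID


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE SrcParent (ID INTEGER PRIMARY KEY AUTOINCREMENT, Name TEXT);
    CREATE TABLE SrcChild (ParentID INTEGER PRIMARY KEY REFERENCES SrcParent(ID), Detail TEXT);
    CREATE TABLE DstParent (ID INTEGER PRIMARY KEY AUTOINCREMENT, Name TEXT);
    CREATE TABLE DstChild (ParentID INTEGER PRIMARY KEY REFERENCES DstParent(ID), Detail TEXT);
    INSERT INTO SrcParent (ID, Name) VALUES (1, 'alpha'), (2, 'beta');
    INSERT INTO SrcChild VALUES (1, 'alpha detail'), (2, 'beta detail');
    -- The destination already uses ID 1, so keeping the source IDs would collide.
    INSERT INTO DstParent (ID, Name) VALUES (1, 'existing');
    INSERT INTO DstChild VALUES (1, 'existing detail');
    """
)
copy_with_new_keys(conn)
```

The copied rows get fresh IDs, so nothing depends on the source and destination key ranges being disjoint, which is the main advantage over Option 1.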
Before you decide on an approach to copy data to a new set of tables, you should investigate the following items:
Which tables reference the data from the parent and child tables (both sets of tables)?
Are there any stored procedures/triggers that utilize the data in these tables?
How does this table get populated? Is there an application/data feed that inserts data in this table?
How does the data in this table get deleted?
What is the purpose of the primary key beyond ensuring uniqueness in the table? For this you will have to understand how the data in the table is used by the application.
Based on the answers, you should be able to pick the right solution that will meet the requirements of the application.
My money is on Option 1 (see SET IDENTITY_INSERT, http://msdn.microsoft.com/en-us/library/ms188059.aspx).
But: Why are you copying them?
If you are just altering the table schema, or migrating to new tables and retiring the old ones, why not use ALTER TABLE?
If you are going to run them side-by-side you probably need the keys to match.
But to answer your question, use Option 1, definitely.

How do I rename primary key values in Oracle?

Our application uses an Oracle 10g database where several primary keys are exposed to the end user: product codes and such. Unfortunately it's too late to do anything about this, as there are tons of reports and custom scripts out there that we do not have control over. We can't redefine the primary keys or mess up the database structure.
Now some customers want to change some of the primary key values. What they initially wanted to call P23A1 should now be called CAT23MOD1 (not a real example, but you get my meaning.)
Is there an easy way to do this? I would prefer a script of some sort, that could be parametrized to fit other tables and keys, but external tools would be acceptable if no other way exists.
The problem is presumably with the foreign keys that reference the PK. You must define the foreign keys as "deferrable initially immediate", as described in this Tom Kyte article: http://www.oracle.com/technology/oramag/oracle/03-nov/o63asktom.html
That lets you ...
Defer the constraints
Modify the parent value
Modify the child values
Commit the change
Simple.
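As a rough illustration of those four steps, here is a runnable sketch on SQLite through Python's sqlite3 (not Oracle). In SQLite the foreign key is declared DEFERRABLE INITIALLY DEFERRED so it is only checked at commit; in Oracle, with a "deferrable initially immediate" key, the transaction would start with SET CONSTRAINTS ALL DEFERRED instead. The Product and OrderLine tables and the rename_product helper are hypothetical.

```python
import sqlite3

# isolation_level=None leaves transaction control to the explicit BEGIN/COMMIT below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE Product (Code TEXT PRIMARY KEY);
    CREATE TABLE OrderLine (
        ID INTEGER PRIMARY KEY,
        ProductCode TEXT NOT NULL
            REFERENCES Product(Code) DEFERRABLE INITIALLY DEFERRED
    );
    INSERT INTO Product VALUES ('P23A1');
    INSERT INTO OrderLine VALUES (1, 'P23A1');
    """
)


def rename_product(conn: sqlite3.Connection, old_code: str, new_code: str) -> None:
    """Rename a key value; the FK is only checked when the transaction commits."""
    conn.execute("BEGIN")
    try:
        # Modify the parent value (children are briefly orphaned).
        conn.execute("UPDATE Product SET Code = ? WHERE Code = ?", (new_code, old_code))
        # Modify the child values so they match again.
        conn.execute(
            "UPDATE OrderLine SET ProductCode = ? WHERE ProductCode = ?",
            (new_code, old_code),
        )
        conn.execute("COMMIT")  # the deferred constraint is verified here
    except sqlite3.Error:
        conn.execute("ROLLBACK")
        raise


rename_product(conn, "P23A1", "CAT23MOD1")
```

If a child table is forgotten, the commit fails and the whole rename is rolled back, so the data never ends up half-renamed.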
Oops. A little googling makes it appear that, inexplicably, Oracle does not implement ON UPDATE CASCADE, only ON DELETE CASCADE. To find workarounds google ORACLE ON UPDATE CASCADE. Here's a link on Creating A Cascade Update Set of Tables in Oracle.
Original answer:
If I understand correctly, you want to change the values of data in primary key columns, not the actual constraint names of the keys themselves.
If this is true, it can most easily be accomplished by redefining ALL the foreign keys that reference the affected primary key constraint as ON UPDATE CASCADE. This means that when you make a change to the primary key value, the engine will automatically update all related values in foreign key tables.
Be aware that if this results in a lot of changes it could be prohibitively expensive in a production system.
If you have to do this on a live system with no DDL changes to the tables involved, then I think your only option is to (for each value of the PK that needs to be changed):
Insert into the parent table a copy of the row with the PK value replaced
For each child table, update the FK value to the new PK value
Delete the parent table row with the old PK value
If you have a list of parent tables and the PK values to be renamed, it shouldn't be too hard to write a procedure that does this - the information in USER_CONSTRAINTS can be used to get the FK-related tables for a given parent table.
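The insert/update/delete sequence could look roughly like this. The sketch runs on SQLite through Python's sqlite3 rather than Oracle, with an ordinary (non-deferrable) foreign key to show that no constraint changes are needed; the Product and OrderLine tables, the Description column, and rename_key_by_copy are made-up names, and a real procedure would loop over the child tables found in USER_CONSTRAINTS.

```python
import sqlite3


def rename_key_by_copy(conn: sqlite3.Connection, old_code: str, new_code: str) -> None:
    """Rename a parent key without deferring or altering any constraint."""
    with conn:  # all three steps commit together or not at all
        # 1. Copy the parent row under the new key value.
        conn.execute(
            "INSERT INTO Product (Code, Description) "
            "SELECT ?, Description FROM Product WHERE Code = ?",
            (new_code, old_code),
        )
        # 2. Repoint every child row (repeat for each child table).
        conn.execute(
            "UPDATE OrderLine SET ProductCode = ? WHERE ProductCode = ?",
            (new_code, old_code),
        )
        # 3. Nothing references the old row any more, so it can be deleted.
        conn.execute("DELETE FROM Product WHERE Code = ?", (old_code,))


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE Product (Code TEXT PRIMARY KEY, Description TEXT);
    CREATE TABLE OrderLine (
        ID INTEGER PRIMARY KEY,
        ProductCode TEXT NOT NULL REFERENCES Product(Code)
    );
    INSERT INTO Product VALUES ('P23A1', 'Widget');
    INSERT INTO OrderLine VALUES (1, 'P23A1'), (2, 'P23A1');
    """
)
rename_key_by_copy(conn, "P23A1", "CAT23MOD1")
```

One caveat: if the parent table has other unique columns, the temporary duplicate row in step 1 would violate them, so those columns need special handling.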