How to change a column in a SQL table without breaking other dependencies?

I'm sure this must be quite a common question, but I couldn't find a good answer so far.
Here is my question:
I've got a table named Contacts with a varchar column Title. Now, in the middle of development, I want to replace the field Title with TitleID, a foreign key into a ContactTitles table. At the moment the Contacts table has over 60 dependencies (other tables, views, functions).
What is the safest and easiest way to do that?
We use MSSQL 2005; the data has already been migrated, so I just want to change the schema.
Edit:
Thanks to all for the quick replies.
As mentioned, the Contacts table has over 60 dependents, but when the following query was run, only 5 of them used the Title column. The migration script had already been run, so no data changes were required.
/* gets all objects which use the specified column */
SELECT DISTINCT so.name
FROM syscomments sc
JOIN sysobjects so ON sc.id = so.id
WHERE sc.text LIKE '%Title%' AND sc.text LIKE '%TitleID%'
Then I went through those 5 views and updated them manually.

Use refactoring methods. Start off by creating a new field called TitleID, then copy all the titles into the ContactTitles table. Then, one by one, update each of the dependencies to use the TitleID field. Just make sure you've still got a working system after each step.
If the data is going to be changing, you'll have to be careful and make sure that any changes to the Title column also change the ContactTitles table. You'll only have to keep them in sync while you're doing the refactoring.
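To make that concrete, here is a minimal T-SQL sketch of those first steps, using the table and column names from the question (the varchar length and the constraint names are my own assumptions):

/* 1. Create the lookup table and fill it with the distinct existing titles */
CREATE TABLE ContactTitles (
    TitleID int IDENTITY(1, 1) NOT NULL CONSTRAINT PK_ContactTitles PRIMARY KEY,
    Title varchar(100) NOT NULL UNIQUE
)

INSERT INTO ContactTitles (Title)
SELECT DISTINCT Title FROM Contacts WHERE Title IS NOT NULL

/* 2. Add the new column (in its own batch), then back-fill it from the lookup */
ALTER TABLE Contacts ADD TitleID int NULL
GO

UPDATE c
SET c.TitleID = ct.TitleID
FROM Contacts c
JOIN ContactTitles ct ON ct.Title = c.Title

/* 3. Enforce the relationship; drop the old column only once every
      dependency has been moved over to TitleID */
ALTER TABLE Contacts ADD CONSTRAINT FK_Contacts_ContactTitles
    FOREIGN KEY (TitleID) REFERENCES ContactTitles (TitleID)
-- ALTER TABLE Contacts DROP COLUMN Title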
Edit: There's even a book about it! Refactoring Databases.

As others pointed out it depends on your RDBMS.
There are two approaches:
make a change to the table and fix all dependencies
make a view that you can use instead of direct access to the table (this can guard you against future changes in the underlying core table(s), but you might lose some update functionality, depending on your DBMS)
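For the second approach, such a view might look like the sketch below, reusing the names from the question (the view name and the ContactID column are assumptions; note that a view over a join like this is not freely updatable without an INSTEAD OF trigger):

CREATE VIEW ContactsWithTitle
AS
SELECT c.ContactID, c.TitleID, ct.Title   -- plus the other Contacts columns
FROM Contacts c
LEFT JOIN ContactTitles ct ON ct.TitleID = c.TitleID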

For Microsoft SQL Server, Red Gate has a (not free) product that can help with this refactoring: http://www.red-gate.com/products/sql_refactor/index.htm
In the past I have managed to do this quite easily (if primitively) by simply getting a list of things to review:
SELECT * FROM sys.objects
WHERE OBJECT_DEFINITION(OBJECT_ID) LIKE '%Contacts%'
(and possibly taking dependencies information into account and filtering by object type)
Script all the ones of interest in Management Studio, then simply go down the list, reviewing each one and changing the CREATE to ALTER. It should be quite a simple and repetitive change, even for 60 possible dependencies. Additionally, if you refer to a non-existent column, you should get an error message when you run the ALTER script.
If you use * in your queries, or ad hoc SQL in your applications, things may obviously be a bit more difficult.
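If you do want to take object types into account, the same idea can be narrowed down with a filter, something like this (the type codes cover views, procedures, and scalar/inline/table-valued functions):

SELECT name, type_desc
FROM sys.objects
WHERE OBJECT_DEFINITION(object_id) LIKE '%Contacts%'
  AND type IN ('V', 'P', 'FN', 'IF', 'TF')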

Use sp_depends 'Table Name' to check the dependencies of the table,
and then use sp_rename to rename the column, which is very useful.
sp_rename automatically renames the associated index whenever a PRIMARY KEY or UNIQUE constraint is renamed. If a renamed index is tied to a PRIMARY KEY constraint, the PRIMARY KEY constraint is also automatically renamed by sp_rename.
Then start updating the procedures and functions one by one; there is no other good option for a change like this that I know of (if you find one, tell me too).
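As a sketch, with the question's table (note that sp_rename only changes the name; converting the column from varchar to an int foreign key still requires ALTER TABLE steps like the ones in the earlier answer):

EXEC sp_depends 'Contacts'   -- lists objects that depend on the table
EXEC sp_rename 'Contacts.Title', 'TitleID', 'COLUMN'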

Related

Cast 'new' and 'old' dynamically [duplicate]

I'm interested in using the following audit mechanism in an existing PostgreSQL database.
http://wiki.postgresql.org/wiki/Audit_trigger
but, would like (if possible) to make one modification. I would also like to log the primary key's value where it could be queried later. So, I would like to add a field named something like "record_id" to the "logged_actions" table. The problem is that every table in the existing database has a different primary key field name. The good news is that the database has a very consistent naming convention: it's always <tablename>_id. So, if a table is named "employee", the primary key is "employee_id".
Is there any way to do this? Basically, I need something like OLD.FieldByName(x) or OLD[x] to get the value out of the id field to put into the record_id field in the new audit record.
I do understand that I could just create a separate, custom trigger for each table that I want to keep track of, but it would be nice to have it be generic.
edit: I also understand that the key value does get logged in the old/new data fields. But what I would like is to make querying the history easier and more efficient. In other words,
select * from audit.logged_actions where table_name = 'xxxx' and record_id = 12345;
another edit: I'm using PostgreSQL 9.1
Thanks!
You didn't mention your version of PostgreSQL, which is very important when writing answers to questions like this.
If you're running PostgreSQL 9.0 or newer (or able to upgrade) you can use this approach as documented by Pavel:
http://okbob.blogspot.com/2009/10/dynamic-access-to-record-fields-in.html
In general, what you want is to reference a dynamically named field in a record-typed PL/PgSQL variable like 'NEW' or 'OLD'. This has historically been annoyingly hard, and is still awkward but is at least possible in 9.0.
Your other alternative - which may be simpler - is to write your audit triggers in plperlu, where dynamic field references are trivial.
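For illustration, here is a cut-down sketch of such a generic trigger function using the hstore extension (installable with CREATE EXTENSION on 9.1). The function name and the exact logged_actions columns are assumptions on my part, and the real wiki trigger logs considerably more:

CREATE EXTENSION IF NOT EXISTS hstore;

CREATE OR REPLACE FUNCTION audit.log_with_record_id() RETURNS trigger AS $$
DECLARE
    pk_col text := TG_TABLE_NAME || '_id';   -- relies on the <tablename>_id convention
    pk_val text;
BEGIN
    IF TG_OP = 'DELETE' THEN
        pk_val := hstore(OLD) -> pk_col;     -- dynamic field access on OLD
    ELSE
        pk_val := hstore(NEW) -> pk_col;     -- INSERT/UPDATE: read it from NEW
    END IF;
    INSERT INTO audit.logged_actions (table_name, action, record_id)
    VALUES (TG_TABLE_NAME, TG_OP, pk_val);
    IF TG_OP = 'DELETE' THEN
        RETURN OLD;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;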

Generally usable alias names for tables and columns

When we define tables and columns, we have to use company-specific naming conventions.
New employees often have problems understanding all these table and column names.
So I had the idea that it would be great if we could define that the table 'Customer' could always be referenced as 'Kunde', and the column 'SPR' referenced as 'spezialPreis'.
And I want to define these alias names in the database schema, so that nobody has to know the old-style original names.
Is something like this possible?
I'm especially interested in a solution for MS SQL Server.
Additional information: the main goal is not to bypass a naming convention; it is to let old applications work with the old names, while letting us build new ones with new and more understandable names.
Additionally, we can't use views, because we want to use the aliases in all statements: INSERT, UPDATE, DELETE, ALTER, GRANT, ... whatever.
I would suggest adding a computed column:
alter table customer add kunde as (old_column_name_here)
You cannot readily remove the old name, unless you use a view. But this at least adds the new name, so you can migrate to using it (and, perhaps, eventually rename the old name to kunde).
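With the names from the question, that would be, for example:

alter table Customer add spezialPreis as (SPR)

Bear in mind that a computed column is read-only, so this helps on the SELECT side only; it does not cover the INSERT/UPDATE requirement from the question.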

SQL Server - Schema/Code Analysis Rules - What would your rules include?

We're using Visual Studio Database Edition (DBPro) to manage our schema. This is a great tool that, among the many things it can do, can analyse our schema and T-SQL code based on rules (much like what FxCop does with C# code), and flag certain things as warnings and errors.
Some example rules might be that every table must have a primary key, no underscores in column names, every stored procedure must have comments, etc.
The number of rules built into DBPro is fairly small, and a bit odd. Fortunately DBPro has an API that allows the developer to create their own. I'm curious as to the types of rules you and your DB team would create (both schema rules and T-SQL rules). Looking at some of your rules might help us decide what we should consider.
Thanks - Randy
Some of mine. Not all could be tested programmatically:
No hungarian-style prefixes (like "tbl" for table, "vw" for view)
If there is any chance this would ever be ported to Oracle, no identifiers longer than 30 characters.
All table and column names expressed in lower-case letters only
Underscores between words in column and table names--we differ on this one obviously
Table names are singular ("customer" not "customers")
Words that make up table, column, and view names are not abbreviated, concatenated, or acronym-based unless necessary.
Indexes will be prefixed with “IX_”.
Primary Keys are prefixed with “PK_”.
Foreign Keys are prefixed with “FK_”.
Unique Constraints are prefixed with “UC_”.
I suspect most of my list would be hard to put in a rules engine, but here goes:
If possible I'd have it report any tables that are defined wider than the number of bytes that can be stored in a single record (8,060 bytes per row in SQL Server, excluding varchar(max) and text type fields) and/or a data page.
I want all related PK and FK columns to have the same name if at all possible. The only time it isn't possible is when you need to have two FKs in the same table relating to one PK and even then, I would name it the name of the PK and a prefix or suffix describing the difference. For instance if I had a PersonID PK and a table needed to have both the sales rep id and the customer id, they would be CustomerPersonID, and RepPersonID.
I would check to make sure all FKs have an index (see the query sketch after this list).
I would want to know about all fields that are required but have no default value. Depending on what it is, you may not want to define a default, but I would want to be able to easily see which ones don't, to hopefully find the ones that should have one.
I would want all triggers checked to see that they are set-based and not designed to run for one row at time.
No table without a defined Unique index or PK. No table where the PK is more than one field. No table where the PK is not an int.
No object names that use reserved words for the database I'm using.
No fields with the word Date as part of the name that are not defined as date or datetime.
No table without an associated audit table.
No field called SSN, SocialSecurityNumber, etc. that is not encrypted. Same for any field named CreditCardNumber.
No user defined datatypes (In SQL Server at least, these are far more trouble than they are worth.)
No views that call other views. Experience has shown me these are often a performance disaster waiting to happen. Especially if they layer more than one layer deep.
If using replication, no table without a GUID field.
All tables should have a DateInserted field and InsertedBy field (even with auditing, it is often easier to research data problems if this info is easily available.)
Consistent use of the same case in naming. It doesn't matter which as long as all use the same one.
No tables with a field called ID. Hate these with a passion. They are so useless. ID fields should be named tablenameID if a PK and with the PK name if an FK.
No spaces or special characters in object names. In other words if you need special handling for the database to recognize it in the correct context in query, don't use it.
If it is going to analyze code as well, I'd want to see any code that uses a cursor or a correlated subquery. Why create performance problems from the start?
I would want to see if a proc uses dynamic SQL and, if so, whether it has an input bit variable called Debug (and code to only print the dynamic SQL statement, rather than execute it, when the Debug variable is set to 1).
I'd want to be able to check that, if there is more than one statement causing an action in the database (insert/update/delete), there is also an explicit transaction in the proc, and error trapping to roll the whole thing back if any part of it fails.
I'm sure I could think of more.
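For the FK-index check mentioned above, a rough sketch against the catalog views (SQL Server 2005 and later; for simplicity it only tests whether the first column of each foreign key leads some index):

SELECT fk.name AS foreign_key,
       OBJECT_NAME(fk.parent_object_id) AS table_name
FROM sys.foreign_keys fk
JOIN sys.foreign_key_columns fkc
    ON fkc.constraint_object_id = fk.object_id
   AND fkc.constraint_column_id = 1
WHERE NOT EXISTS (
    SELECT 1
    FROM sys.index_columns ic
    WHERE ic.object_id = fk.parent_object_id
      AND ic.column_id = fkc.parent_column_id
      AND ic.key_ordinal = 1   -- the FK column must lead the index
)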

Designing a database schema for a Job Listing website?

For a school project I'm making a simple Job Listing website in ASP.NET MVC (we got to choose the framework).
I've thought about it awhile and this is my initial schema:
JobPostings
+---JobPostingID
+---UserID
+---Company
+---JobTitle
+---JobTypeID
+---JobLocationID
+---Description
+---HowToApply
+---CompanyURL
+---LogoURL
JobLocations
+---JobLocationID
+---City
+---State
+---Zip
JobTypes
+---JobTypeID
+---JobTypeName
Note: the UserID will be linked to a Member table generated by a MembershipProvider.
Now, I am extremely new to relational databases and SQL so go lightly on me.
What about naming? Should it be just "Description" under the JobPostings table, or should it be "JobDescription" (same with other columns in that main table). Should it be "JobPostingID" or just "ID"?
General tips are appreciated as well.
Edit: The JobTypes are fixed for our project, there will be 15 job categories. I've made this a community wiki to encourage people to post.
A few thoughts:
Unless you know a priori that there is a limited list of job types, don't split that into a separate table;
Just use "ID" as the primary key on each table (we already know it's a JobLocationID, because it's in the JobLocations table...);
I'd drop the 'Job' prefix from the fields in JobPostings, as it's a bit redundant.
There's a load of domain-specific info that you could include, like salary ranges, and applicant details, but I don't know how far you're supposed to be going with this.
Job Schema http://gfilter.net/junk/JobSchema.png
I split Company out of Job Posting, as this makes maintaining the companies easier.
I also added a XREF table that can store the relationship between companies and locations. You can setup a row for each company office, and have a very simple way to find "Alternative Job Locations for this Company".
This should be a fun project...good luck.
EDIT: I would add Created and LastModifiedBy (Referring to a UserID). These are great columns for general housekeeping.
Looks good to me, I would recommend also adding Created, LastModified and Deleted columns to the user updateable tables as well for future proofing.
Make sure you explicitly define your primary and foreign keys as well in your schema.
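For instance, a minimal T-SQL sketch of the postings table with explicit keys (the data types and lengths here are guesses, not part of the original schema):

CREATE TABLE JobPostings (
    JobPostingID  int IDENTITY(1, 1) NOT NULL CONSTRAINT PK_JobPostings PRIMARY KEY,
    UserID        int NOT NULL,   -- FK to the MembershipProvider's member table
    Company       nvarchar(100) NOT NULL,
    JobTitle      nvarchar(100) NOT NULL,
    JobTypeID     int NOT NULL CONSTRAINT FK_JobPostings_JobTypes
                      REFERENCES JobTypes (JobTypeID),
    JobLocationID int NOT NULL CONSTRAINT FK_JobPostings_JobLocations
                      REFERENCES JobLocations (JobLocationID),
    Description   nvarchar(max) NOT NULL,
    HowToApply    nvarchar(max) NULL,
    CompanyURL    nvarchar(255) NULL,
    LogoURL       nvarchar(255) NULL
)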
What about naming? Should it be just "Description" under the JobPostings table, or should it be "JobDescription" (same with other columns in that main table)? Should it be "JobPostingID" or just "ID"?
Personally, I specify generic-sounding fields like "ID" and "Description" with prefixes as you suggest. It avoids confusion about what the id/description applies to when you write queries later on (and saves you the trouble of aliasing them).
I'd recommend folding the data you're going to be storing in JobLocations back into the main table. It's ok to have a table for states and another for countries, but I doubt you want a table that contains every city/state/country pair, you really don't gain anything from it. What happens if someone goes in and edits their location? You'd have to check to make sure no other joblisting points to the location and edit it, else create a new location and point to that instead.
My usual pattern is address and city as text with the record and FK to a state table.
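A sketch of that pattern against the question's JobPostings table (the States table and all names here are made up):

CREATE TABLE States (
    StateCode char(2) NOT NULL CONSTRAINT PK_States PRIMARY KEY,   -- 'MO', 'IL', ...
    StateName nvarchar(50) NOT NULL
)

ALTER TABLE JobPostings ADD
    Address   nvarchar(200) NULL,
    City      nvarchar(100) NULL,
    StateCode char(2) NULL CONSTRAINT FK_JobPostings_States
              REFERENCES States (StateCode)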

Copying relational data from database to database

Edit: Let me completely rephrase this, because I'm not sure there's an XML way like I was originally describing.
Yet another edit: This needs to be a repeatable process, and it has to be able to be set up in a way that it can be called in C# code.
In database A, I have a set of tables, related by PKs and FKs. A parent table, with child and grandchild tables, let's say.
I want to copy a set of rows from database A to database B, which has identically named tables and fields. For each table, I want to insert into the same table in database B. But I can't be constrained to use the same primary keys. The copy routine must create new PKs for each row in database B, and must propagate those to the child rows. I'm keeping the same relations between the data, in other words, but not the same exact PKs and FKs.
How would you solve this? I'm open to suggestions. SSIS isn't completely ruled out, but it doesn't look to me like it'll do this exact thing. I'm also open to a solution in LINQ, or using typed DataSets, or using some XML thing, or just about anything that'll work in SQL Server 2005 and/or C# (.NET 3.5). The best solution wouldn't require SSIS, and wouldn't require writing a lot of code. But I'll concede that this "best" solution may not exist.
(I didn't make this task up myself, nor the constraints; this is how it was given to me.)
I think the SQL Server utility tablediff.exe might be what you are looking for.
See also this thread.
First, let me say that SSIS is your best bet. But, to answer the question you asked...
I don't believe you will be able to get away with creating new IDs all around; or rather, you could, but you would need to keep the original IDs around to use for lookups.
The best you can get is one insert statement per table. Here is an example of the code to do SELECTs to get the data out of your XML sample:
declare @xml xml
set @xml='<People Key="1" FirstName="Bob" LastName="Smith">
    <PeopleAddresses PeopleKey="1" AddressesKey="1">
        <Addresses Key="1" Street="123 Main" City="St Louis" State="MO" ZIP="12345" />
    </PeopleAddresses>
</People>
<People Key="2" FirstName="Harry" LastName="Jones">
    <PeopleAddresses PeopleKey="2" AddressesKey="2">
        <Addresses Key="2" Street="555 E 5th St" City="Chicago" State="IL" ZIP="23456" />
    </PeopleAddresses>
</People>
<People Key="3" FirstName="Sally" LastName="Smith">
    <PeopleAddresses PeopleKey="3" AddressesKey="1">
        <Addresses Key="1" Street="123 Main" City="St Louis" State="MO" ZIP="12345" />
    </PeopleAddresses>
</People>
<People Key="4" FirstName="Sara" LastName="Jones">
    <PeopleAddresses PeopleKey="4" AddressesKey="2">
        <Addresses Key="2" Street="555 E 5th St" City="Chicago" State="IL" ZIP="23456" />
    </PeopleAddresses>
</People>
'

select t.b.value('./@Key', 'int') PeopleKey,
       t.b.value('./@FirstName', 'nvarchar(50)') FirstName,
       t.b.value('./@LastName', 'nvarchar(50)') LastName
from @xml.nodes('//People') t(b)

select t.b.value('../../@Key', 'int') PeopleKey,
       t.b.value('./@Street', 'nvarchar(50)') Street,
       t.b.value('./@City', 'nvarchar(50)') City,
       t.b.value('./@State', 'char(2)') [State],
       t.b.value('./@ZIP', 'char(5)') Zip
from @xml.nodes('//Addresses') t(b)
What this does is take nodes from the XML and parse out the data. To get the relational ID from People, we use ../../ to go up the chain.
Dump the XML approach and use the import wizard / SSIS.
By far the easiest way is Red Gate's SQL Data Compare. You can set it up to do just what you described in a minute or two.
I love Red Gate's SQL Compare and Data Compare too but it won't meet his requirements for the changing primary keys as far as I can tell.
If cross-database queries/linked servers are an option, you could do this with a stored procedure that copies the records from the parent/child tables in DB A into temporary tables on DB B, and then adds a column for the new primary key in the temp child table, which you would update after inserting the headers.
My question is if the records don't have the same primary key how do you tell if it's a new record? Is there some other candidate key? If these are new tables why can't they have the same primary key?
I have created the same thing with a set of stored procedures.
Database B will have its own primary keys, but will store Database A's primary keys for debugging purposes. It means I can have more than one Database A!
Data is copied via a linked server. Not too fast; SSIS is faster. But SSIS is not for beginners, and it is not easy to code something that works with changing source tables.
And it is easy to call a stored procedure from C#.
I'd script it in a stored procedure, using INSERTs to do the hard work. Your code will take the PKs from Table A (presumably via SCOPE_IDENTITY()) - I assume that the PK for Table A is an identity field?
You could use temporary tables, cursors or you might prefer to use the CLR - it might lend itself to this kind of operation.
I'd be surprised to find a tool that could do this off the shelf with either a) pre-determined keys, or b) identity fields (clearly Tables B & C don't have them).
Are you clearing the destination tables each time and then starting again? That will make a big difference to the solution you need to implement. If you are doing a complete re-import each time then you could do something like the following:
Create a temporary table or table variable to record the old and new primary keys for the parent table.
Insert the parent table data into the destination and use the OUTPUT clause to capture the new IDs and insert them with the old IDs into the temp table.
NOTE: Using the output clause is efficient and allows you to do the insert in bulk without cycling through each record to be inserted.
Insert the child table data. Join to the temp table to retrieve the new foreign key required.
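A rough T-SQL sketch of those steps, assuming both databases sit on the same server (every name here is made up). One wrinkle: an INSERT's OUTPUT clause can only reference the inserted columns, so the old/new mapping is recovered by joining back on a natural key, which must uniquely identify the parent row:

-- map table: new parent key alongside the natural key it was inserted with
DECLARE @map TABLE (NewID int, SourceName nvarchar(100))

INSERT INTO DestDB.dbo.Parent (Name, CreatedOn)
OUTPUT inserted.ParentID, inserted.Name INTO @map (NewID, SourceName)
SELECT Name, CreatedOn
FROM SourceDB.dbo.Parent

-- children: translate the foreign key through the map via the natural key
INSERT INTO DestDB.dbo.Child (ParentID, Detail)
SELECT m.NewID, c.Detail
FROM SourceDB.dbo.Child c
JOIN SourceDB.dbo.Parent p ON p.ParentID = c.ParentID
JOIN @map m ON m.SourceName = p.Name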
The above process could be done using T-SQL Script, C# code or SSIS. My preference would be for SSIS.
If you are adding each time then you may need to keep a permanent table to track the relationship between source database primary keys and destination database primary keys (at least for the parent table). If you needed to keep this kind of data out of the destination database, you could get SSIS to store/retrieve it from some kind of logging database or even a flat file.
You could probably avoid the above scenario if there is a combination of fields in the parent table that can be used to uniquely identify that record and therefore "find" the primary key for that record in the destination database.
I think most likely what I'm going to use is typed datasets. It won't be a generalized solution; we'll have to regenerate them if any of the tables change. But based on what I've been told, that's not a problem; the tables aren't expected to change much.
Datasets will make it reasonably easy to loop through the data hierarchically and refresh PKs from the database after insert.
When dealing with similar tasks I simply created a set of stored procedures to do the job.
As the task that you specified is pretty custom, you are not likely to find "ready to use" solution.
Just to give you some hints:
If the databases are on different servers use linked servers so you can access both source and destination tables simply through TSQL
In the stored procedure:
Identify the parent items that need to be copied - you said that the primary keys are different so you need to use unique constraints instead (you should be able to define them if the tables are normalised)
Identify the child items that need to be copied based on the identified parents, to check if some of them are already in the destination db use the unique constraints approach again
Identify the grandchild items (same logic as with parent-child)
Copy the data over starting from the top of the hierarchy (parents, then children, then grandchildren), so that the new parent keys already exist when the rows referencing them are inserted
There is no need for cursors etc.; simply store the intermediate results in a temporary table (or a table variable if working within one stored procedure)
That approach worked for me pretty well.
You can of course add a parameter to the main stored procedure so you can either copy all new records or only the ones that you specify.
Let me know if that is of any help.