I have several tables which are worked on within a development environment, then moved to production. If they don't already exist in production, it's fine to just generate the table creation script from SSMS and run it. However, there are occasions where the table already exists in production but all that's needed is an extra column or constraint. The problem is knowing exactly what has changed.
Is there a way to get SQL to compare my CREATE TABLE statement against the existing table and only apply what has changed? Essentially I am trying to do the below and SQL correctly complains that the table exists already.
I would have to manually write an ALTER query which on a real example would be difficult due to the sheer volume of columns. Is there a better / easier way to see what has changed? Note that this involves two separate database servers.
CREATE TABLE suppliers
( supplier_id int NOT NULL,
supplier_name char(50) NOT NULL,
contact_name char(50),
CONSTRAINT suppliers_pk PRIMARY KEY (supplier_id)
);
CREATE TABLE suppliers
( supplier_id int NOT NULL,
supplier_name char(50) NOT NULL,
contact_name char(50),
contact_number char(20), --this has been added
CONSTRAINT suppliers_pk PRIMARY KEY (supplier_id)
);
Also, dropping and recreating wouldn't be a possibility because data would be lost.
SSMS can generate the schema change script if you make the change in the table designer (right-click on the table in Object Explorer and select Design). Then, instead of applying the change immediately, select Table Designer-->Generate Change Script from the menu. Note that depending on the change, SSMS may need to recreate the table, although data will be retained. For such changes, SSMS requires you to uncheck the option "Prevent saving changes that require table re-creation" under Tools-->Options-->Designers-->Table and Database Designers. Review the script to make sure you're good with it.
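For a change as small as the one in the question, the change script boils down to a single ALTER TABLE. A minimal hand-written sketch, assuming the table lives in the dbo schema (the question does not say) and that the new column may be NULL so existing rows are unaffected; the COL_LENGTH guard simply makes the script safe to run more than once:

```sql
-- Add contact_number only if production does not have it yet.
-- COL_LENGTH returns NULL when the column (or table) does not exist.
IF COL_LENGTH(N'dbo.suppliers', N'contact_number') IS NULL
BEGIN
    ALTER TABLE dbo.suppliers
        ADD contact_number char(20) NULL;
END;
```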
SQL Server Data Tools (SSDT) and third-party tools (e.g. from Red-Gate and ApexSQL) have schema-compare features to generate the needed DDL after the fact. There are also features like migration scripts to facilitate continuous integration and source control integration as well. I suggest you keep database objects under source control and leverage database tooling as part of your development process.
Typically we use something like database migrations for this, as a feature outside of the database. For example, several of our C# apps use a tool called FluentMigrator (FM). We write a migration in code that adds the new columns we need to the dev database. When the project is debugged, FM runs the migration and modifies the dev db, the dev code uses the new columns, and all is well. FM knows not to run the same migration again.
When the time comes to put something live, the FM migration is part of the release: the app is deployed to the website and the migrations run again, this time updating the live db, so the live code can use the new columns and still all is well.
If there is nothing outside of your sql server (not sure how you manage that, but..), then surely you must be writing scripts (or using gui to generate scripts) that alter the DB right? So just keep those scripts and run them as part of the process of "going live"
If you are looking at this from a perspective that these db already exist created by someone else and they threw away the scripts, then you can one time catch up using a Database Schema Compare tool. Microsoft have one in SSDT - see here for more info on how it is used:
https://msdn.microsoft.com/en-us/library/hh272690(v=vs.103).aspx
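If no compare tool is at hand, a rough manual fallback is to run the same metadata query on both servers and diff the two result sets. This is only a sketch: it covers columns, not constraints or indexes, and the dbo schema name is an assumption.

```sql
-- Run on each server; the output is ordered so the two results line up.
SELECT  c.COLUMN_NAME,
        c.DATA_TYPE,
        c.CHARACTER_MAXIMUM_LENGTH,
        c.NUMERIC_PRECISION,
        c.NUMERIC_SCALE,
        c.IS_NULLABLE,
        c.COLUMN_DEFAULT
FROM    INFORMATION_SCHEMA.COLUMNS AS c
WHERE   c.TABLE_SCHEMA = N'dbo'
  AND   c.TABLE_NAME   = N'suppliers'
ORDER BY c.ORDINAL_POSITION;
```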
If you don't have many constraints I suggest you create a dynamic script to cast and import the data into your new tables. If this doesn't fail then you just drop the old tables and rename the newly created ones.
I'm trying to automate the initialising of a SQL DB on Azure. For some (lookup) tables, data needs to be copied from a source DB into the new DB each time it is initialised.
To do this I execute a query containing
SELECT * INTO [target_db_name]..[my_table_name] FROM [source_db_name].dbo.[my_table_name]
At this point an exception is thrown telling me that
Reference to database and/or server name in 'source_db_name.dbo.my_table_name'
is not supported in this version of SQL Server.
Having looked into this, I've found that it's now possible to reference another Azure SQL DB provided it has been configured as an external data source. [here and here]
So, in my target DB I've executed the following statement:
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<password>';
CREATE DATABASE SCOPED CREDENTIAL cred
WITH IDENTITY = '<username>',
SECRET = '<password>';
CREATE EXTERNAL DATA SOURCE [source_db_name]
WITH
(
TYPE=RDBMS,
LOCATION='my_location.database.windows.net',
DATABASE_NAME='source_db_name',
CREDENTIAL= cred
);
CREATE EXTERNAL TABLE [dbo].[my_table_name](
[my_column_name] BIGINT NOT NULL
)
WITH
(
DATA_SOURCE = [source_db_name],
SCHEMA_NAME = 'dbo',
OBJECT_NAME = 'my_table_name'
)
But the SELECT INTO statement still yields the same exception.
Furthermore, a simple SELECT * FROM [source_db_name].[my_table_name] yields the exception "Invalid object name 'source_db_name.my_table_name'".
What am I missing?
UPDATE
I've found the problem: CREATE EXTERNAL TABLE creates what appears to be a table in the target DB. To query this, the source DB name should not be used. So where I was failing with:
SELECT * FROM [source_db_name].[my_table_name]
I see that I should really be querying
SELECT * FROM [my_table_name]
It looks like you might need to define that external table, according to what appears to be the correct syntax:
CREATE EXTERNAL TABLE [dbo].[source_table](
...
)
WITH
(
DATA_SOURCE = source_db_name
);
The three part name approach is unsupported, except through elastic database query.
Now, since you're creating an external table, the query can treat the external table as an object native to your [target_db]. This allows you to write the query SELECT * FROM [my_table_name], as you figured out in your edit. From the documentation, it is important to note that "This allows for read-only querying of remote databases." So, this table object is not writable, but your question only mentioned reading from it to populate a new table.
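Putting that together, the copy step can read from the external table and write to an ordinary local table. A minimal sketch, assuming the local destination is called dbo.my_table_name_local (a hypothetical name; it has to differ from the external table, which already occupies dbo.my_table_name in the target DB):

```sql
-- dbo.my_table_name is the external table defined above (read-only, remote data).
-- dbo.my_table_name_local is a hypothetical local table that receives the copy.
CREATE TABLE dbo.my_table_name_local
(
    my_column_name BIGINT NOT NULL
);

INSERT INTO dbo.my_table_name_local (my_column_name)
SELECT my_column_name
FROM   dbo.my_table_name;   -- no database prefix: the external table is a local object
```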
As promised, here's how I handle database deploys for SQL Server. I use the same method for on-prem, Windows Azure SQL Database, or SQL on a VM in Azure. It took a lot of pain, trial and error.
It all starts with SQL Server Data Tools, SSDT
If you're not already using SSDT to manage your database as a project separate from your applications, you need to. Grab a copy here. If you are already running a version of Visual Studio on your machine, you can get a version of SSDT specific for that version of Visual Studio. If you aren't already running VS, then you can just grab SSDT and it will install the minimal Visual Studio components to get you going.
Setting up your first Database project is easy! Start a new Database project.
Then, right click on your database project and choose Import -> Database.
Now, you can point at your current development copy of your database and import its schema into your project. This process will pull in all the tables, views, stored procedures, functions, etc. from the source database. When the import finishes, the objects appear in your project in Solution Explorer.
There is a folder for each schema imported, as well as a security folder for defining the schemas in your database. Explore these folders and look through the files created.
You will find all the scripts created are the CREATE scripts. This is important to remember for managing the project. You can now save your new solution, and then check it into your current source control system. This is your initial commit.
Here's the new thought process to managing your database project. As you need to make schema changes, you will come into this project to make changes to these create statements to define the state you want the object to be. You are always creating CREATE statements, never ALTER statements in your schema. Check out the example below.
Updating a table
Let's say we've decided to start tracking changes on our dbo.ETLProcess table. We will need columns to track CreatedDateTime, CreatedByID, LastUpdatedDateTime, and LastUpdatedByID. Open the dbo.ETLProcess file in the dbo\Tables folder and you'll see the current version of the table looks like this:
CREATE TABLE [dbo].[ETLProcess] (
[ETLProcessID] INT IDENTITY (1, 1) NOT NULL
, [TenantID] INT NOT NULL
, [Name] NVARCHAR (255) NULL
, [Description] NVARCHAR (1000) NULL
, [Enabled] BIT DEFAULT ((1)) NOT NULL
, CONSTRAINT [PK_ETLProcess__ETLProcessID_TenantID]
PRIMARY KEY CLUSTERED ([ETLProcessID], [TenantID])
, CONSTRAINT [FK_ETLProcess_Tenant__TenantID]
FOREIGN KEY ([TenantID])
REFERENCES [dbo].[Tenant] ([TenantID])
);
To record the change we want to make, we simply add in the columns into the table like this:
CREATE TABLE [dbo].[ETLProcess] (
[ETLProcessID] INT IDENTITY (1, 1) NOT NULL
, [TenantID] INT NOT NULL
, [Name] NVARCHAR (255) NULL
, [Description] NVARCHAR (1000) NULL
, [Enabled] BIT DEFAULT ((1)) NOT NULL
, [CreatedDateTime] DATETIME DEFAULT(GETUTCDATE())
, [CreatedByID] INT
, [LastUpdatedDateTime] DATETIME DEFAULT(GETUTCDATE())
, [LastUpdatedByID] INT
, CONSTRAINT [PK_ETLProcess__ETLProcessID_TenantID]
PRIMARY KEY CLUSTERED ([ETLProcessID], [TenantID])
, CONSTRAINT [FK_ETLProcess_Tenant__TenantID]
FOREIGN KEY ([TenantID])
REFERENCES [dbo].[Tenant] ([TenantID])
);
I didn't add any foreign keys to the definition, but if you wanted to create them, you would add them below the Foreign Key to Tenant. Once you've made the changes to the file, save it.
The next thing you'll want to get in the habit of is checking your database to make sure it's valid. In the programming world, you'd run a test build to make sure it compiles. Here, we do something very similar. From the main menu hit Build -> Build Database1 (the name of our database project).
The output window will open and tell you if there are any problems with your project. This is where you'll see things like Foreign keys referencing tables that don't yet exist, bad syntax in your create object statements, etc. You'll want to clean these up before you check your update into source control. You'll have to fix them before you will be able to deploy your changes to your development environment.
Once your database project builds successfully and it's checked in to source control, you're ready for the next change in process.
Deploying Changes
Earlier I told you it was important to remember all your schema statements are CREATE statements. Here's why: SSDT gives you two ways to deploy your changes to a target instance. Both of them use these create statements to compare your project against the target. By comparing two create statements it can generate ALTER statements needed to get a target instance up to date with your project.
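For the dbo.ETLProcess change above, the resulting script would need to do roughly the following. This is a hand-written approximation, not actual SSDT output, and the constraint names are hypothetical (the CREATE statement leaves the defaults unnamed, so the real names would be system-generated):

```sql
-- Existing rows keep NULL in the new columns; the defaults apply to new rows only
-- (add WITH VALUES after a DEFAULT to back-fill existing rows instead).
ALTER TABLE [dbo].[ETLProcess]
    ADD [CreatedDateTime]     DATETIME NULL
            CONSTRAINT [DF_ETLProcess_CreatedDateTime] DEFAULT (GETUTCDATE()),
        [CreatedByID]         INT NULL,
        [LastUpdatedDateTime] DATETIME NULL
            CONSTRAINT [DF_ETLProcess_LastUpdatedDateTime] DEFAULT (GETUTCDATE()),
        [LastUpdatedByID]     INT NULL;
```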
The two options for deploying these changes are a T-SQL change script, or dacpac. Based on the original post, it sounds like the change script will be most familiar.
Right click on your database project and choose Schema Compare.
By default, your database project will be the source on the left. Click Select target on the right, and select the database instance you want to "upgrade". Then click Compare in the upper left, and SSDT will compare the state of your project with the target database.
You will then get a list of all the objects in your target database that are not in the project (in the DROP section), a list of all objects that are different between the project and target database (in the ALTER Section), and a list of objects that are in your project and not yet in your target database (in the ADD section).
Sometimes you'll see changes listed that you don't want to make (changes in the casing of your object names, or the number of parentheses around your default statements). You can deselect changes like that. Other times you won't be ready to deploy certain changes to the target, and you can deselect those too. All items left checked will either be applied to the target database, if you choose Update, or added to your change script, if you hit the "Generate Script" icon.
Handling lookup data in your Database Project
Now we finally get to your original question: how do I deploy lookup data to a target database? In your database project you can right click on the project in Solution Explorer and choose Add -> New Item. You'll get a dialog box. On the left, click on User Scripts, then on the right, choose Post-Deployment Script.
By adding a script of this type, SSDT knows you want to run this step after any schema changes. This is where you will enter your lookup values, which means they're included in source control too!
Now here's a very important note about these post deployment scripts. You need to be sure any T-SQL you add here will work if you call the script in a new database, in an existing database, or if you called it 100 times in a row. As a result of this requirement, I've taken to including all my lookup values in merge statements. That way I can handle inserts, updates, and deletes.
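A sketch of such a merge, assuming a hypothetical lookup table dbo.ProcessStatus (ProcessStatusID, Name) that is not part of the project above, with a non-identity key and a NOT NULL name column. MERGE requires SQL Server 2008 or later.

```sql
-- Re-runnable: the table ends up matching the VALUES list no matter
-- what it contained before (empty, partially filled, or already current).
MERGE INTO dbo.ProcessStatus AS tgt
USING (VALUES
        (1, N'Pending'),
        (2, N'Running'),
        (3, N'Complete')
      ) AS src (ProcessStatusID, Name)
   ON tgt.ProcessStatusID = src.ProcessStatusID
WHEN MATCHED AND tgt.Name <> src.Name THEN
    UPDATE SET Name = src.Name                 -- value changed in source control
WHEN NOT MATCHED BY TARGET THEN
    INSERT (ProcessStatusID, Name)             -- new lookup value
    VALUES (src.ProcessStatusID, src.Name)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;                                    -- value removed from the script
```

The DELETE branch will fail if other tables still reference a removed value through a foreign key, which is usually the behaviour you want; drop that branch if lookup rows should never be deleted automatically.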
Before committing this file to source control, test it in all three scenarios above to be sure it won't fail.
Wrapping it all up
Moving from making changes directly in your target environments to using SSDT and source controlling your changes is a big step in the maturation of your software development life-cycle. The good news is it makes you think about your database as part of the deployment process in a way that is compatible with continuous integration/continuous deployment methods.
Once you get used to the new process, you can then learn how to add a dacpac generated from SSDT into your deployment scripts and have the changes pushed at just the right time in your deployment.
It also frees you from your SELECT INTO problem, your original problem.
I need help writing a TSQL script to modify two columns' data type.
We are changing two columns:
uniqueidentifier -> varchar(36) * * * has a primary key constraint
xml -> nvarchar(4000)
My main concern is production deployment of the script...
The table is actively used by a public website that gets thousands of hits per hour. Consequently, we need the script to run quickly, without affecting service on the front end. Also, we need to be able to automatically rollback the transaction if an error occurs.
Fortunately, the table only contains about 25 rows, so I am guessing the update will be quick.
This database is SQL Server 2005.
(FYI - the type changes are required because of a 3rd-party tool which is not compatible with SQL Server's xml and uniqueidentifier types. We've already tested the change in dev and there are no functional issues with the change.)
As David said, executing a script against a production database without taking a backup or stopping the site is not the best idea. That said, if you only need to change one table with a small number of rows, you can prepare a script to:
Begin transaction
Create a new table with the final structure you want
Copy the data from the original table to the new table
Rename the old table to, for example, original_name_old
Rename the new table to original_table_name
End transaction
This will end with a table that is named as the original one but with the new structure you want, and in addition you maintain the original table with a backup name, so if you want to rollback the change you can create a script to do a simple drop of the new table and rename of the original one.
If the table has foreign keys the script will be a little more complicated, but is still possible without much work.
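The steps above can be sketched as follows. The table dbo.Settings and its columns are hypothetical stand-ins (the question does not give the real names), and the syntax is kept compatible with SQL Server 2005. If any XML value does not fit in nvarchar(4000), the conversion raises an error and the whole transaction is rolled back.

```sql
-- Hypothetical table: dbo.Settings (SettingId uniqueidentifier PK, Payload xml).
SET XACT_ABORT ON;   -- any run-time error dooms the transaction

BEGIN TRY
    BEGIN TRANSACTION;

    -- 1. New table with the final structure. Constraint names must be unique
    --    within the schema, so the new primary key gets its own name.
    CREATE TABLE dbo.Settings_new
    (
        SettingId varchar(36)    NOT NULL,
        Payload   nvarchar(4000) NULL,
        CONSTRAINT PK_Settings_new PRIMARY KEY (SettingId)
    );

    -- 2. Copy the data, converting both columns. The exclusive table lock
    --    blocks other sessions on the old table until the transaction ends.
    INSERT INTO dbo.Settings_new (SettingId, Payload)
    SELECT CONVERT(varchar(36), SettingId),
           CONVERT(nvarchar(4000), Payload)
    FROM   dbo.Settings WITH (TABLOCKX, HOLDLOCK);

    -- 3. Swap the names; the old table stays around as a backup.
    EXEC sp_rename 'dbo.Settings', 'Settings_old';
    EXEC sp_rename 'dbo.Settings_new', 'Settings';

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF XACT_STATE() <> 0
        ROLLBACK TRANSACTION;

    -- SQL Server 2005 has no THROW, so re-raise with RAISERROR.
    DECLARE @msg nvarchar(2048);
    SET @msg = ERROR_MESSAGE();
    RAISERROR(@msg, 16, 1);
END CATCH;
```

After the swap the primary key constraint is still named PK_Settings_new; it can be renamed with sp_rename as well if the name matters.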
"Consequently, we need the script to run quickly, without affecting service on the front end."
This is just an opinion, but it's based on experience: That's a bad idea. It's better to have a short, (pre-announced if possible) scheduled downtime than to take the risk.
The only exception is if you really don't care if the data in these tables gets corrupted, and you can be down for an extended period.
In this situation, based on the types of changes you're making and the testing you've already performed, it sounds like the risk is minimal and you SHOULD be able to do it safely, but nothing is guaranteed.
First, you need to have a fall-back plan in case something goes wrong. The short version of a MINIMAL reasonable plan would include:
Shut down the website
Make a backup of the database
Run your script
Test the DB for integrity
Bring the website back online
It would be very unwise to attempt to make such an update while the website is live. You run the risk of being down for an extended period if something goes wrong.
A GOOD plan would also have you testing this against a copy of the database and a copy of the website (a test/staging environment) first and then taking the steps outlined above for the live server update. You have already done this. Kudos to you!
There are even better methods for making such an update, but the trade-off of down time for safety is a no-brainer in most cases.
And if you absolutely need to do this in live then you might consider this:
1) Build an offline version of the table with the new datatypes and copied data.
2) Build all the required keys and indexes on the offline tables.
3) Swap the tables out in a transaction. You could rename the old table to something else as an emergency backup.
sp_help 'sp_rename'
But TEST all of this FIRST in a prod-like environment. And make sure your backups are up to date. AND do this when you are least busy.
I create 2 tables and another 1 with foreign keys to the other two.
I realized I want to make some changes to table no 3.
I try to update a field but I get an error "Saving changes is not permitted. The changes you have made require the following table to be dropped and re-created."
I delete those 2 relationships but when I look at dependencies I see my table still depends on those 2 and I still cannot make any change to it.
What can I do?
You can also enable saving changes that require dropping of tables by going to "tools->options->designers->Table and database designers" and unchecking "Prevent saving changes that require table re-creation"
Be careful with this though, sometimes it'll drop a table without being able to recreate it, which makes you lose all data that was in the table.
When using Microsoft SQL Server Management Studio 2012, the same message occurs.
I used the script feature to make the modifications, which is a rather good workaround if you want to keep using the designer only in its "safe" mode.
The GUI for creating a foreign key in particular is not the best, in my opinion. Adding an FK with a script (ALTER TABLE) is faster than using that GUI feature.
Changing a column from NULL to NOT NULL is not hard to script either. (In the designer, unchecking 'Allow Nulls' for a column triggers the "Saving changes is not permitted" message.)
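A sketch of the two scripted changes mentioned above, using hypothetical tables dbo.OrderLine and dbo.Product (the question does not name its tables or columns):

```sql
-- Add a foreign key without opening the designer.
ALTER TABLE dbo.OrderLine
    ADD CONSTRAINT FK_OrderLine_Product
        FOREIGN KEY (ProductId) REFERENCES dbo.Product (ProductId);

-- Change a nullable column to NOT NULL. The full data type must be restated,
-- and the statement fails if any existing row still holds NULL.
ALTER TABLE dbo.OrderLine
    ALTER COLUMN Quantity int NOT NULL;
```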
There are several questions on SO about version control for SQL and lots of resources on the web, but I can't find something that quite covers what I'm trying to do.
First off, I'm talking about a methodology here. I'm familiar with the various source control applications out there, I'm familiar with tools like Red Gate's SQL Compare, etc., and I know how to write an application to check things in and out of my source control system automatically. If there is a tool which would be particularly helpful in providing a whole new methodology, or which has useful and uncommon functionality, then great, but for the tasks mentioned above I'm already set.
The requirements that I'm trying to meet are:
The database schema and look-up table data are versioned
DML scripts for data fixes to larger tables are versioned
A server can be promoted from version N to version N + X where X may not always be 1
Code isn't duplicated within the version control system - for example, if I add a column to a table I don't want to have to make sure that the change is in both a create script and an alter script
The system needs to support multiple clients who are at various versions for the application (trying to get them all up to within 1 or 2 releases, but not there yet)
Some organizations keep incremental change scripts in their version control and to get from version N to N + 3 you would have to run scripts for N->N+1 then N+1->N+2 then N+2->N+3. Some of these scripts can be repetitive (for example, a column is added but then later it is altered to change the data type). We're trying to avoid that repetitiveness since some of the client DBs can be very large, so these changes might take longer than necessary.
Some organizations will simply keep a full database build script at each version level then use a tool like SQL Compare to bring a database up to one of those versions. The problem here is that intermixing DML scripts can be a problem. Imagine a scenario where I add a column, use a DML script to fill said column, then in a later version that column name is changed.
Perhaps there is some hybrid solution? Maybe I'm just asking for too much? Any ideas or suggestions would be greatly appreciated though.
If the moderators think that this would be more appropriate as a community wiki, please let me know.
Thanks!
I struggled with this for several years before recently adopting a strategy that seems to work pretty well. Key points I live by:
The database doesn't need to be independently versioned from the app
All database update scripts should be idempotent
As a result, I no longer create any kind of version tables. I simply add changes to a numbered sequence of .sql files that can be applied at any given time without corrupting the database. If it makes things easier, I'll write a simple installer screen for the app to allow administrators to run these scripts whenever they like.
Of course, this method does impose a few requirements on the database design:
All schema changes are done through script - no GUI work.
Extra care must be taken to ensure all keys, constraints, etc.. are named so they can be referenced by a later update script, if necessary.
All update scripts should check for existing conditions.
Examples from a recent project:
001.sql:
if object_id(N'dbo.Registrations') is null
begin
create table dbo.Registrations
(
[Id] uniqueidentifier not null,
[SourceA] nvarchar(50) null,
[SourceB] nvarchar(50) null,
[Title] nvarchar(50) not null,
[Occupation] nvarchar(50) not null,
[EmailAddress] nvarchar(100) not null,
[FirstName] nvarchar(50) not null,
[LastName] nvarchar(50) not null,
[ClinicName] nvarchar(200) not null,
[ClinicAddress] nvarchar(50) not null,
[ClinicCity] nvarchar(50) not null,
[ClinicState] nchar(2) not null,
[ClinicPostal] nvarchar(10) not null,
[ClinicPhoneNumber] nvarchar(10) not null,
[ClinicPhoneExtension] nvarchar(10) not null,
[ClinicFaxNumber] nvarchar(10) not null,
[NumberOfVets] int not null,
[IpAddress] nvarchar(20) not null,
[MailOptIn] bit not null,
[EmailOptIn] bit not null,
[Created] datetime not null,
[Modified] datetime not null,
[Deleted] datetime null
);
end
if not exists(select 1 from information_schema.table_constraints where constraint_name = 'pk_registrations')
alter table dbo.Registrations add
constraint pk_registrations primary key nonclustered (Id);
if not exists (select 1 from sysindexes where [name] = 'ix_registrations_created')
create clustered index ix_registrations_created
on dbo.Registrations(Created);
if not exists (select 1 from sysindexes where [name] = 'ix_registrations_email')
create index ix_registrations_email
on dbo.Registrations(EmailAddress);
if not exists (select 1 from sysindexes where [name] = 'ix_registrations_name_and_clinic')
create index ix_registrations_name_and_clinic
on dbo.Registrations (FirstName,
LastName,
ClinicName);
002.sql
/**********************************************************************
The original schema allowed null for these columns, but we don't want
that, so update existing nulls and change the columns to disallow
null values
*********************************************************************/
update dbo.Registrations set SourceA = '' where SourceA is null;
update dbo.Registrations set SourceB = '' where SourceB is null;
alter table dbo.Registrations alter column SourceA nvarchar(50) not null;
alter table dbo.Registrations alter column SourceB nvarchar(50) not null;
/**********************************************************************
The client wanted to modify the signup form to include a fax opt-in
*********************************************************************/
if not exists
(
select 1
from information_schema.columns
where table_schema = 'dbo'
and table_name = 'Registrations'
and column_name = 'FaxOptIn'
)
alter table dbo.Registrations
add FaxOptIn bit null
constraint df_registrations_faxoptin default 0;
003.sql, 004.sql, etc...
At any given time I can run the entire series of scripts against the database in any state and know that things will be immediately brought up to speed with the current version of the app. Because everything is scripted, it's much easier to build a simple installer to do this, and adding the schema changes to source control is no problem at all.
You've got quite a rigorous set of requirements, I'm not sure whether you'll find something that puts checks in all the boxes, especially the multiple concurrent schemas and the intelligent version control.
The most promising tool that I've read about that kind of fits is Liquibase.
Here are some additional links:
http://en.wikipedia.org/wiki/LiquiBase
http://www.ibm.com/developerworks/java/library/j-ap08058/index.html
Yes, you're asking for a lot, but they're all really pertinent points! Here at Red Gate we're moving towards a complete database development solution with our SQL Source Control SSMS extension and we're facing similar challenges.
http://www.red-gate.com/products/SQL_Source_Control/index.htm
For the upcoming release we're fully supporting schema changes, and supporting static data indirectly via our SQL Data Compare tool. All changes are saved as creation scripts, although when you're updating or deploying to a database, the tool will ensure that the changes are applied appropriately as an ALTER or CREATE.
The most challenging requirement that doesn't yet have a simple solution is version management and deployment, which you describe very clearly. If you make complex changes to the schema and data, it may be inevitable that a handcrafted migration script is constructed to get between two adjacent versions, as not all of the 'intent' is always saved alongside a newer version. Column renames are a prime example. The solution could be for a system to be devised that saves the intent, or if this is too complex, allows the user to supply a custom script to perform the complex change. Some sort of version management framework would manage these and "magically" construct deployment scripts from two arbitrary versions.
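As an illustration of such a custom script, here is a sketch of a hand-written rename step. A compare tool looking only at two CREATE statements cannot infer it, since it would see one column dropped and another added. Table and column names are hypothetical.

```sql
-- Rename dbo.Customer.Phone to PhoneNumber, preserving the data.
-- The guards make the step a no-op if it has already been applied.
IF COL_LENGTH(N'dbo.Customer', N'Phone') IS NOT NULL
   AND COL_LENGTH(N'dbo.Customer', N'PhoneNumber') IS NULL
BEGIN
    EXEC sp_rename N'dbo.Customer.Phone', N'PhoneNumber', N'COLUMN';
END;
```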
For this kind of issue, use Visual Studio Team System 2008 for version control of your SQL database.
It has a number of features available, such as:
Datacompare
Schemacompare
version controlling
about database version control : http://www.codinghorror.com/blog/2006/12/is-your-database-under-version-control.html
for more detail check : http://msdn.microsoft.com/en-us/library/ms364062(VS.80).aspx
We are using SQL Examiner to keep our database schema under version control. I've also tried VS2010, but in my opinion the VS approach is too complex for small and mid-size projects. With SQL Examiner I mostly work in SSMS and use SQL Examiner to check in updates to SVN (TFS and SourceSafe are supported too, but I've never tried them).
Here is a description of SQL Examiner's approach: How to get your database under version control
Try DBSourceTools. (http://dbsourcetools.codeplex.com)
It's open source, and specifically designed to script an entire database (tables, views, procs) to disk, and then re-create that database through a deployment target.
You can script all data, or just specify which tables to script data for.
Additionally, you can zip up the results for distribution.
We use it for source control of databases, and to test update patches for new releases.
In the back-end it's built around SMO, and thus supports SQL 2000, 2005 and 2008.
DBDiff is integrated, to allow for schema comparisons.
Have fun,
- Nathan.
I add a column of type tinyint, set to not allow nulls, to a table and generate the change script. The table has data in it at this time. The script creates a temp table and inserts the data from the current table into it. It then drops the old table and renames the temp table to the original table's name. All fine and good. My question is: why, if I do the same thing to another table (same field, but different table), does the generated change script not include this table-rebuild code?
Any tips would be greatly appreciated!
If the table does not contain data, there is no need to rebuild the table. Essentially Management Studio "plays it safe" behind the scenes by generating the script this way if it thinks it can't do it simply by just modifying the table. In my experience, it often does this when it doesn't really need to, however there are exceptions ... for example if you add your column not at the "end" of the table. Rather than make changes in the UI and script them, I recommend becoming familiar with the ALTER TABLE command. Rebuilding the table in that manner can be catastrophic on a production system, and can usually be avoided.
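For the change in the question, a direct ALTER TABLE might look like the sketch below. Table, column, and constraint names are hypothetical. Because the table already holds rows, a NOT NULL column needs a default so the existing rows get a value; the column is appended at the end of the table, so no rebuild script is involved.

```sql
ALTER TABLE dbo.MyTable
    ADD StatusCode tinyint NOT NULL
        CONSTRAINT DF_MyTable_StatusCode DEFAULT (0);
```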