SQL version control methodology

There are several questions on SO about version control for SQL and lots of resources on the web, but I can't find something that quite covers what I'm trying to do.
First off, I'm talking about a methodology here. I'm familiar with the various source control applications out there, I'm familiar with tools like Red Gate's SQL Compare, etc., and I know how to write an application to check things in and out of my source control system automatically. If there is a tool that would be particularly helpful in providing a whole new methodology, or that has useful and uncommon functionality, then great, but for the tasks mentioned above I'm already set.
The requirements that I'm trying to meet are:
The database schema and look-up table data are versioned
DML scripts for data fixes to larger tables are versioned
A server can be promoted from version N to version N + X where X may not always be 1
Code isn't duplicated within the version control system - for example, if I add a column to a table I don't want to have to make sure that the change is in both a create script and an alter script
The system needs to support multiple clients who are at various versions of the application (we're trying to get them all to within 1 or 2 releases, but we're not there yet)
Some organizations keep incremental change scripts in their version control and to get from version N to N + 3 you would have to run scripts for N->N+1 then N+1->N+2 then N+2->N+3. Some of these scripts can be repetitive (for example, a column is added but then later it is altered to change the data type). We're trying to avoid that repetitiveness since some of the client DBs can be very large, so these changes might take longer than necessary.
Some organizations will simply keep a full database build script at each version level then use a tool like SQL Compare to bring a database up to one of those versions. The problem here is that intermixing DML scripts can be a problem. Imagine a scenario where I add a column, use a DML script to fill said column, then in a later version that column name is changed.
Perhaps there is some hybrid solution? Maybe I'm just asking for too much? Any ideas or suggestions would be greatly appreciated though.
If the moderators think that this would be more appropriate as a community wiki, please let me know.
Thanks!

I struggled with this for several years before recently adopting a strategy that seems to work pretty well. Key points I live by:
The database doesn't need to be independently versioned from the app
All database update scripts should be idempotent
As a result, I no longer create any kind of version tables. I simply add changes to a numbered sequence of .sql files that can be applied at any given time without corrupting the database. If it makes things easier, I'll write a simple installer screen for the app to allow administrators to run these scripts whenever they like.
Of course, this method does impose a few requirements on the database design:
All schema changes are done through script - no GUI work.
Extra care must be taken to ensure all keys, constraints, etc.. are named so they can be referenced by a later update script, if necessary.
All update scripts should check for existing conditions.
Examples from a recent project:
001.sql:
if object_id(N'dbo.Registrations') is null
begin
    create table dbo.Registrations
    (
        [Id] uniqueidentifier not null,
        [SourceA] nvarchar(50) null,
        [SourceB] nvarchar(50) null,
        [Title] nvarchar(50) not null,
        [Occupation] nvarchar(50) not null,
        [EmailAddress] nvarchar(100) not null,
        [FirstName] nvarchar(50) not null,
        [LastName] nvarchar(50) not null,
        [ClinicName] nvarchar(200) not null,
        [ClinicAddress] nvarchar(50) not null,
        [ClinicCity] nvarchar(50) not null,
        [ClinicState] nchar(2) not null,
        [ClinicPostal] nvarchar(10) not null,
        [ClinicPhoneNumber] nvarchar(10) not null,
        [ClinicPhoneExtension] nvarchar(10) not null,
        [ClinicFaxNumber] nvarchar(10) not null,
        [NumberOfVets] int not null,
        [IpAddress] nvarchar(20) not null,
        [MailOptIn] bit not null,
        [EmailOptIn] bit not null,
        [Created] datetime not null,
        [Modified] datetime not null,
        [Deleted] datetime null
    );
end

if not exists (select 1 from information_schema.table_constraints where constraint_name = 'pk_registrations')
    alter table dbo.Registrations add
        constraint pk_registrations primary key nonclustered (Id);

if not exists (select 1 from sysindexes where [name] = 'ix_registrations_created')
    create clustered index ix_registrations_created
        on dbo.Registrations (Created);

if not exists (select 1 from sysindexes where [name] = 'ix_registrations_email')
    create index ix_registrations_email
        on dbo.Registrations (EmailAddress);

if not exists (select 1 from sysindexes where [name] = 'ix_registrations_name_and_clinic')
    create index ix_registrations_name_and_clinic
        on dbo.Registrations (FirstName, LastName, ClinicName);
002.sql:
/**********************************************************************
The original schema allowed null for these columns, but we don't want
that, so update existing nulls and change the columns to disallow
null values
*********************************************************************/
update dbo.Registrations set SourceA = '' where SourceA is null;
update dbo.Registrations set SourceB = '' where SourceB is null;
alter table dbo.Registrations alter column SourceA nvarchar(50) not null;
alter table dbo.Registrations alter column SourceB nvarchar(50) not null;
/**********************************************************************
The client wanted to modify the signup form to include a fax opt-in
*********************************************************************/
if not exists
(
    select 1
    from information_schema.columns
    where table_schema = 'dbo'
      and table_name = 'Registrations'
      and column_name = 'FaxOptIn'
)
    alter table dbo.Registrations
        add FaxOptIn bit null
            constraint df_registrations_faxoptin default 0;
003.sql, 004.sql, etc...
At any given time I can run the entire series of scripts against the database in any state and know that things will be immediately brought up to speed with the current version of the app. Because everything is scripted, it's much easier to build a simple installer to do this, and adding the schema changes to source control is no problem at all.

You've got quite a rigorous set of requirements; I'm not sure whether you'll find anything that ticks all the boxes, especially the multiple concurrent schemas and the intelligent version control.
The most promising tool I've read about that comes close is Liquibase.
Here are some additional links:
http://en.wikipedia.org/wiki/LiquiBase
http://www.ibm.com/developerworks/java/library/j-ap08058/index.html

Yes, you're asking for a lot, but they're all really pertinent points! Here at Red Gate we're moving towards a complete database development solution with our SQL Source Control SSMS extension and we're facing similar challenges.
http://www.red-gate.com/products/SQL_Source_Control/index.htm
For the upcoming release we're fully supporting schema changes, and supporting static data indirectly via our SQL Data Compare tool. All changes are saved as creation scripts, although when you're updating or deploying to a database, the tool will ensure that the changes are applied appropriately as an ALTER or CREATE.
The most challenging requirement that doesn't yet have a simple solution is version management and deployment, which you describe very clearly. If you make complex changes to the schema and data, it may be inevitable that a handcrafted migration script is constructed to get between two adjacent versions, as not all of the 'intent' is always saved alongside a newer version. Column renames are a prime example. The solution could be for a system to be devised that saves the intent, or if this is too complex, allows the user to supply a custom script to perform the complex change. Some sort of version management framework would manage these and "magically" construct deployment scripts from two arbitrary versions.
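To illustrate the intent problem: a hand-crafted rename migration might look like the sketch below (the old and new column names here are hypothetical, borrowed from the Registrations example earlier in this thread), whereas a pure schema compare would typically see the same change as a drop plus an add and lose the data:
if exists (select 1 from information_schema.columns
           where table_schema = 'dbo'
             and table_name = 'Registrations'
             and column_name = 'SourceA')
   and not exists (select 1 from information_schema.columns
                   where table_schema = 'dbo'
                     and table_name = 'Registrations'
                     and column_name = 'ReferralSource')
    -- a rename preserves the column's data; drop-and-add would not
    exec sp_rename 'dbo.Registrations.SourceA', 'ReferralSource', 'COLUMN';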

For this kind of issue, consider Visual Studio Team System 2008 for version controlling your SQL database.
TFS offers a number of features for this, such as:
Data Compare
Schema Compare
Version control
About database version control: http://www.codinghorror.com/blog/2006/12/is-your-database-under-version-control.html
For more detail, see: http://msdn.microsoft.com/en-us/library/ms364062(VS.80).aspx

We are using SQL Examiner for keeping the database schema under version control. I've tried VS2010 as well, but in my opinion the VS approach is too complex for small and mid-size projects. With SQL Examiner I mostly work in SSMS and use SQL Examiner to check updates in to SVN (TFS and SourceSafe are also supported, but I've never tried them).
Here is a description of SQL Examiner's approach: How to get your database under version control

Try DBSourceTools. (http://dbsourcetools.codeplex.com)
It's open source, and specifically designed to script an entire database - tables, views, procs - to disk, and then re-create that database through a deployment target.
You can script all data, or just specify which tables to script data for.
Additionally, you can zip up the results for distribution.
We use it for source control of databases, and to test update patches for new releases.
In the back-end it's built around SMO, and thus supports SQL 2000, 2005 and 2008.
DBDiff is integrated, to allow for schema comparisons.
Have fun,
- Nathan.

Related

Do SQL pre-deployment and deployment scripts run in one session?

I have a SQL Server project in Visual Studio with one Customers.PreDeployment1.sql file:
DROP TABLE Customers
My question is: why, after the deletion happens, is Customers.sql (Build action) not running (the table is not getting created in SQL Server)?
CREATE TABLE [dbo].[Customers]
(
[Id] INT NOT NULL PRIMARY KEY,
[First_Name] NCHAR(10) NULL,
[Last_Name] NCHAR(10) NULL
)
I couldn't find any explanation for that... maybe because it's all considered one session?
Because the deletion doesn't happen until after Visual Studio has done its diff/compare and generated the script.
So when the diff is performed (and the deploy script is created), the Customers table is still in the DB; VS then adds the pre-deploy and post-deploy scripts to the change script and runs it.
I'm sure there's a good reason for this behaviour, but I've never found it. There are a few ways around the problem, but it's not clear what you're trying to achieve by dropping the table in the pre-deployment script.

Create table or only add changed/new columns

I have several tables which are worked on within a development environment, then moved to production. If they don't already exist in production, it's fine to just generate the table creation script from SSMS and run it. However, there are occasions where the table already exists in production but all that's needed is an extra column or constraint. The problem is knowing exactly what has changed.
Is there a way to get SQL to compare my CREATE TABLE statement against the existing table and only apply what has changed? Essentially I am trying to do the below and SQL correctly complains that the table exists already.
I would have to manually write an ALTER query which on a real example would be difficult due to the sheer volume of columns. Is there a better / easier way to see what has changed? Note that this involves two separate database servers.
CREATE TABLE suppliers
( supplier_id int NOT NULL,
supplier_name char(50) NOT NULL,
contact_name char(50),
CONSTRAINT suppliers_pk PRIMARY KEY (supplier_id)
);
CREATE TABLE suppliers
( supplier_id int NOT NULL,
supplier_name char(50) NOT NULL,
contact_name char(50),
contact_number char(20), --this has been added
CONSTRAINT suppliers_pk PRIMARY KEY (supplier_id)
);
Also, dropping and recreating wouldn't be a possibility because data would be lost.
SSMS can generate the schema change script if you make the change in the table designer (right-click on the table in Object Explorer and select Design). Then, instead of applying the change immediately, from the menu select Table Designer-->Generate Change Script. Note that depending on the change, SSMS may need to recreate the table, although data will be retained. SSMS requires you uncheck the option to "prevent saving changes that require table re-creation" under Tools-->Options-->Designers-->Table and Database Designers. Review the script to make sure you're good with it.
SQL Server Data Tools (SSDT) and third-party tools (e.g. from Red-Gate and ApexSQL) have schema-compare features to generate the needed DDL after the fact. There are also features like migration scripts to facilitate continuous integration and source control integration as well. I suggest you keep database objects under source control and leverage database tooling as part of your development process.
Typically we use something like database migrations for this, as a feature outside of the database. For example, in several of our C# apps we use a tool called FluentMigrator. We write a script in code that adds the new columns we need to the dev database. When the project is debugged, FM runs the script and modifies the dev DB, the dev code uses the new columns, and all is well. FM knows not to run the same script again.
When the time comes to put something live, the FM script is part of the release; the app is deployed to the website, the migrations run again and update the live DB, so the live code uses the new columns and all is still well.
If there is nothing outside of your SQL Server (not sure how you manage that, but...), then surely you must be writing scripts (or using the GUI to generate scripts) that alter the DB, right? So just keep those scripts and run them as part of the process of "going live".
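For example, an idempotent "going live" script for the suppliers change in the question might look like this (a sketch only; it assumes the dbo schema and the column definition shown above):
if not exists
(
    select 1
    from information_schema.columns
    where table_schema = 'dbo'
      and table_name = 'suppliers'
      and column_name = 'contact_number'
)
    alter table dbo.suppliers
        add contact_number char(20) null;
Because of the existence check, the script can be re-run safely regardless of whether a given production database has already received the change.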
If you are looking at this from the perspective that these DBs already exist, created by someone else who threw away the scripts, then you can catch up one time using a database schema compare tool. Microsoft has one in SSDT - see here for more info on how it is used:
https://msdn.microsoft.com/en-us/library/hh272690(v=vs.103).aspx
If you don't have many constraints, I suggest you create a dynamic script to cast and import the data into your new tables. If this doesn't fail, then you just drop the old tables and rename the newly created ones.
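A rough sketch of that approach (assuming dbo.suppliers_new has already been created with the revised schema; all names are illustrative):
begin try
    insert into dbo.suppliers_new (supplier_id, supplier_name, contact_name, contact_number)
    select supplier_id, supplier_name, contact_name, null -- new column, no source data yet
    from dbo.suppliers;

    drop table dbo.suppliers;
    exec sp_rename 'dbo.suppliers_new', 'suppliers';
end try
begin catch
    print error_message(); -- on failure the original table is left untouched
end catch;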

Using "SELECT INTO" with Azure SQL to copy data from another DB

I'm trying to automate the initialising of a SQL DB on Azure. For some (lookup) tables, data needs to be copied from a source DB into the new DB each time it is initialised.
To do this I execute a query containing
SELECT * INTO [target_db_name]..[my_table_name] FROM [source_db_name].dbo.[my_table_name]
At this point an exception is thrown telling me that
Reference to database and/or server name in 'source_db_name.dbo.my_table_name'
is not supported in this version of SQL Server.
Having looked into this, I've found that it's now possible to reference another Azure SQL DB provided it has been configured as an external data source. [here and here]
So, in my target DB I've executed the following statement:
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<password>';
CREATE DATABASE SCOPED CREDENTIAL cred
WITH IDENTITY = '<username>',
SECRET = '<password>';
CREATE EXTERNAL DATA SOURCE [source_db_name]
WITH
(
TYPE=RDBMS,
LOCATION='my_location.database.windows.net',
DATABASE_NAME='source_db_name',
CREDENTIAL= cred
);
CREATE EXTERNAL TABLE [dbo].[my_table_name](
[my_column_name] BIGINT NOT NULL
)
WITH
(
DATA_SOURCE = [source_db_name],
SCHEMA_NAME = 'dbo',
OBJECT_NAME = 'my_table_name'
)
But the SELECT INTO statement still yields the same exception.
Furthermore, a simple SELECT * FROM [source_db_name].[my_table_name] yields the exception "Invalid object name 'source_db_name.my_table_name'".
What am I missing?
UPDATE
I've found the problem: CREATE EXTERNAL TABLE creates what appears to be a table in the target DB. To query this, the source DB name should not be used. So where I was failing with:
SELECT * FROM [source_db_name].[my_table_name]
I see that I should really be querying
SELECT * FROM [my_table_name]
It looks like you might need to define that external table, according to what appears to be the correct syntax:
CREATE EXTERNAL TABLE [dbo].[source_table](
...
)
WITH
(
DATA_SOURCE = source_db_name
);
The three part name approach is unsupported, except through elastic database query.
Now, since you're creating an external table, the query can treat the external table as an object native to your [target_db] - this allows you to write the query SELECT * FROM [my_table_name], as you figured out from your edits. From the documentation, it is important to note that "This allows for read-only querying of remote databases." So, this table object is not writable, but your question only mentioned reading from it to populate a new table.
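So the copy itself should then be possible with a plain SELECT INTO against the external table, since it now looks like a local object (a sketch; the target table name is hypothetical):
SELECT *
INTO [my_table_name_copy]
FROM [my_table_name];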
As promised, here's how I handle database deploys for SQL Server. I use the same method for on-prem, Windows Azure SQL Database, or SQL on a VM in Azure. It took a lot of pain, trial and error.
It all starts with SQL Server Data Tools, SSDT
If you're not already using SSDT to manage your database as a project separate from your applications, you need to. Grab a copy here. If you are already running a version of Visual Studio on your machine, you can get a version of SSDT specific for that version of Visual Studio. If you aren't already running VS, then you can just grab SSDT and it will install the minimal Visual Studio components to get you going.
Setting up your first Database project is easy! Start a new Database project.
Then, right click on your database project and choose Import -> Database.
Now, you can point at your current development copy of your database and import its schema into your project. This process will pull in all the tables, views, stored procedures, functions, etc. from the source database. When you're finished, you will see the imported objects laid out in Solution Explorer.
There is a folder for each schema imported, as well as a security folder for defining the schemas in your database. Explore these folders and look through the files created.
You will find all the scripts created are the CREATE scripts. This is important to remember for managing the project. You can now save your new solution, and then check it into your current source control system. This is your initial commit.
Here's the new thought process to managing your database project. As you need to make schema changes, you will come into this project to make changes to these create statements to define the state you want the object to be. You are always creating CREATE statements, never ALTER statements in your schema. Check out the example below.
Updating a table
Let's say we've decided to start tracking changes on our dbo.ETLProcess table. We will need columns to track CreatedDateTime, CreatedByID, LastUpdatedDateTime, and LastUpdatedByID. Open the dbo.ETLProcess file in the dbo\Tables folder and you'll see the current version of the table looks like this:
CREATE TABLE [dbo].[ETLProcess] (
[ETLProcessID] INT IDENTITY (1, 1) NOT NULL
, [TenantID] INT NOT NULL
, [Name] NVARCHAR (255) NULL
, [Description] NVARCHAR (1000) NULL
, [Enabled] BIT DEFAULT ((1)) NOT NULL
, CONSTRAINT [PK_ETLProcess__ETLProcessID_TenantID]
PRIMARY KEY CLUSTERED ([ETLProcessID], [TenantID])
, CONSTRAINT [FK_ETLProcess_Tenant__TenantID]
FOREIGN KEY ([TenantID])
REFERENCES [dbo].[Tenant] ([TenantID])
);
To record the change we want to make, we simply add in the columns into the table like this:
CREATE TABLE [dbo].[ETLProcess] (
[ETLProcessID] INT IDENTITY (1, 1) NOT NULL
, [TenantID] INT NOT NULL
, [Name] NVARCHAR (255) NULL
, [Description] NVARCHAR (1000) NULL
, [Enabled] BIT DEFAULT ((1)) NOT NULL
, [CreatedDateTime] DATETIME DEFAULT(GETUTCDATE())
, [CreatedByID] INT
, [LastUpdatedDateTime] DATETIME DEFAULT(GETUTCDATE())
, [LastUpdatedByID] INT
, CONSTRAINT [PK_ETLProcess__ETLProcessID_TenantID]
PRIMARY KEY CLUSTERED ([ETLProcessID], [TenantID])
, CONSTRAINT [FK_ETLProcess_Tenant__TenantID]
FOREIGN KEY ([TenantID])
REFERENCES [dbo].[Tenant] ([TenantID])
);
I didn't add any foreign keys to the definition, but if you wanted to create them, you would add them below the Foreign Key to Tenant. Once you've made the changes to the file, save it.
The next thing you'll want to get in the habit of is checking your database to make sure it's valid. In the programming world, you'd run a test build to make sure it compiles. Here, we do something very similar. From the main menu hit Build -> Build Database1 (the name of our database project).
The output window will open and tell you if there are any problems with your project. This is where you'll see things like Foreign keys referencing tables that don't yet exist, bad syntax in your create object statements, etc. You'll want to clean these up before you check your update into source control. You'll have to fix them before you will be able to deploy your changes to your development environment.
Once your database project builds successfully and it's checked in to source control, you're ready for the next change in process.
Deploying Changes
Earlier I told you it was important to remember all your schema statements are CREATE statements. Here's why: SSDT gives you two ways to deploy your changes to a target instance. Both of them use these create statements to compare your project against the target. By comparing two create statements it can generate ALTER statements needed to get a target instance up to date with your project.
The two options for deploying these changes are a T-SQL change script, or dacpac. Based on the original post, it sounds like the change script will be most familiar.
Right click on your database project and choose Schema Compare.
By default, your database project will be the source on the left. Click Select target on the right, and select the database instance you want to "upgrade". Then click Compare in the upper left, and SSDT will compare the state of your project with the target database.
You will then get a list of all the objects in your target database that are not in the project (in the DROP section), a list of all objects that are different between the project and target database (in the ALTER Section), and a list of objects that are in your project and not yet in your target database (in the ADD section).
Sometimes you'll see changes listed that you don't want to make (changes in the casing of your object names, or the number of parentheses around your default statements); you can deselect changes like that. Other times you won't be ready to deploy certain changes to that target; you can deselect those too. All items left checked will either be applied to the target database, if you choose Update, or added to your change script, if you hit the Generate Script icon.
Handling lookup data in your Database Project
Now we're finally to your original question, how do I deploy lookup data to a target database. In your database project you can right click on the project in Solution Explorer and choose Add -> New Item. You'll get a dialog box. On the left, click on User Scripts, then on the right, choose Post-Deployment Script.
By adding a script of this type, SSDT knows you want to run this step after any schema changes. This is where you will enter your lookup values; as a result, they're included in source control!
Now here's a very important note about these post deployment scripts. You need to be sure any T-SQL you add here will work if you call the script in a new database, in an existing database, or if you called it 100 times in a row. As a result of this requirement, I've taken to including all my lookup values in merge statements. That way I can handle inserts, updates, and deletes.
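For example, a post-deployment script for a hypothetical lookup table might use a merge like this (a sketch only; the table and its values are made up):
MERGE INTO [dbo].[OrderStatus] AS target
USING (VALUES
          (1, N'Pending')
        , (2, N'Shipped')
        , (3, N'Cancelled')
      ) AS source ([OrderStatusID], [Name])
   ON target.[OrderStatusID] = source.[OrderStatusID]
WHEN MATCHED AND target.[Name] <> source.[Name] THEN
    UPDATE SET [Name] = source.[Name]
WHEN NOT MATCHED BY TARGET THEN
    INSERT ([OrderStatusID], [Name])
    VALUES (source.[OrderStatusID], source.[Name])
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;
Because the merge compares the table against the full list of values, it behaves the same on a new database, an existing database, or a hundredth consecutive run.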
Before committing this file to source control, test it in all three scenarios above to be sure it won't fail.
Wrapping it all up
Moving from making changes directly in your target environments to using SSDT and source controlling your changes is a big step in the maturation of your software development life-cycle. The good news is it makes you think about your database as part of the deployment process in a way that is compatible with continuous integration/continuous deployment methods.
Once you get used to the new process, you can then learn how to add a dacpac generated from SSDT into your deployment scripts and have the changes pushed at just the right time in your deployment.
It also frees you from the SELECT INTO problem in your original question.

How to alter SQL Table default data type during design

I would like to change the default data type when designing a table in SQL Server Management Studio Table Designer. My current default is nchar(10) and I am creating a table with a lot of integer data types. I looked in Tools Options but could not find anyplace to change this. I'm running SQL Server 2008 R2.
It is possible, but it requires a modification of the registry:
HKEY_CURRENT_USER\Software\Microsoft\Microsoft SQL Server\100\Tools\Shell\DataProject
This is a tiresome change to make every time you wish to change the default, so I agree with NYCdotNet.
It sounds like you're ready to create your table using T-SQL and not the designer. A variation of the below code will cover you for putting together a basic schema and if you want to do more stuff you can always revise the table in the designer later.
CREATE TABLE MyTableName (
MyID INT NOT NULL IDENTITY(1,1),
MyColumn1 INT NOT NULL,
MyColumn2 INT NULL,
MyColumn3 VARCHAR(100) NULL,
PRIMARY KEY (MyID)
)
UPDATE: My apologies, I read your question too fast. This solution is for the visual designer in VS2015, not SSMS. I will leave the answer up anyway.
This has been changed in Visual Studio 2015. It is now in:
Options > Database Tools > Table and Database Designers > Column Options

Alter SQL Server table column width with indexes

I am using SQL Server 2008 and need to alter a large number of columns across many tables from decimal(9,3) to decimal(12,6).
The issue I currently have is that some tables have several indexes on these columns and the alter statement fails due to the index.
Is there a way to alter the column without losing the index?
I am altering the column as follows:
alter table [TABLE_NAME] alter column [Conf_Tonnes] decimal(12,6) not null
I believe it is not possible to change the type of a column whilst it has any constraint on it. Certainly it used to be the case with earlier versions of SQL Server, and I don't think it has changed.
For practical purposes, you can use a script to list all fields of a certain type:
DECLARE @name AS varchar(20)
SET @name = '<Name of type>'

select T.name as "Table", F.name as "Field"
from sys.tables T
left join sys.columns F on T.object_id = F.object_id
where F.user_type_id = (select user_type_id from sys.types where name = @name)
Which will give you the list of fields which need changing.
You can also drop constraints from fields but the difficult thing is how to recreate them.
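For a single column, the manual sequence is straightforward once you know the index names (a sketch; the table and index names here are hypothetical):
drop index ix_mytable_conf_tonnes on dbo.MyTable;

alter table dbo.MyTable alter column Conf_Tonnes decimal(12,6) not null;

create index ix_mytable_conf_tonnes on dbo.MyTable (Conf_Tonnes);
The hard part, as noted, is doing this at scale across many tables and index definitions.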
If you have external meta-descriptions of the database, you can use them to generate scripts easily. Alternatively, you could run the script generation tool - select all tables on, all options off except tables and indexes - which should generate the full list of tables and indexes for you.
You can find it by right-clicking on the database in Object Explorer, then Tasks -> Generate Scripts.
Unfortunately I don't think you can get index scripts generated without having the table create scripts as well - but Visual Studio's text editing features should make the job of cutting out the bits you don't want fairly easy.
Given time, it's probably possible to put together some scripts to do the whole job automatically, and it would give you a decent set of tools for future use.
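As a starting point for that automation, the catalog query above can be extended to generate the alter statements themselves (a sketch; dropping and recreating the indexes still has to be wrapped around it):
select 'alter table [' + T.name + '] alter column [' + F.name + '] decimal(12,6)'
     + case when F.is_nullable = 1 then ' null' else ' not null' end + ';'
from sys.tables T
join sys.columns F on T.object_id = F.object_id
join sys.types Y on F.user_type_id = Y.user_type_id
where Y.name = 'decimal' and F.precision = 9 and F.scale = 3;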