What are some good methods to push schema updates to end user databases? - sql

This might be too broad, but it's a problem I'm having a bear of a time dealing with. We have an application that we distribute to our end users. It runs on top of a Derby back end. We can push out code changes fairly easily: the application contacts our server, sees there's a new version, downloads it, overwrites the old code, and restarts.
But as we change our code, we also alter the schema of the Derby database, and we don't have great methods to update this. Currently we can push SQL updates via FTP. When the program is connected to the internet, it looks for new SQL files, downloads them, and runs them.
Unfortunately a lot of our clients have limited Internet access, so they get these updates intermittently. Sometimes, because the changes are big enough, their local DB schema gets out of sync with what we want. Or they get the code changes via CD but not the SQL changes (someone mails them the CD).
What I've been trying to do is create a SOAP service that can serve up XML representations of the schema. It's been a huge PITA to develop so far.
What are some methods people are currently using to maintain databases like this? I feel like I'm not the first to do this, so there might be better ways than what I'm doing.
Based on some comments here, here's an update:
Basically, I think we screwed ourselves early on by not adhering to strict versioning of the DB, so I don't know what state everyone's DB is in. A lot of people got custom installs built (groan at will). I need a tool that can tell the differences between their DB and an "official" copy.
I have a tool built, it kind of works, but there's so…many…things to keep track of.

Can you distribute the DB changes as part of the code changes? Then, when the app restarts, it checks if it needs to run any updates on the DB.
Obviously, you'll need to version the DB schema to avoid applying the same update more than once.
I know some applications that do this (mostly in Ruby, but also in Java).

If you already have an update mechanism in place in your application that can download a program to alter the installed source code, why not package and run the schema changes as a part of that upgrade process? I would just run the updates as a part of the Java application then.
My team at work handles these changes by using the MyBatis Migration tool, which represents each schema change as a single migration script which contains the "make change" and "rollback" steps. A changelog table is stored in the database which lists which updates have been applied to that database, which makes it easy for the migrate command to determine which updates it needs to apply when run. This specific tool is probably only really useful when you control the database and have the ability to run shell commands and scripts to alter the database, but you can use the same concepts in your approach - package each schema change as an atomic unit and run them from within your program to bring the schema up to the current version, which you can track in the db itself.
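The changelog idea can be sketched roughly as follows. This is only an illustration in Python with SQLite, not the MyBatis Migrations tool or its API; the `changelog` table, the migration ids, and the `CHANGES` list are hypothetical names chosen for the example.

```python
import sqlite3

# Hypothetical migrations: (id, "make change" SQL, "rollback" SQL).
CHANGES = [
    ("001_create_note", "CREATE TABLE note (id INTEGER PRIMARY KEY)", "DROP TABLE note"),
    ("002_create_tag", "CREATE TABLE tag (id INTEGER PRIMARY KEY)", "DROP TABLE tag"),
]


def applied_ids(conn):
    """Return the ids recorded in the changelog table, creating it if needed."""
    conn.execute("CREATE TABLE IF NOT EXISTS changelog (id TEXT PRIMARY KEY)")
    return {row[0] for row in conn.execute("SELECT id FROM changelog")}


def migrate_up(conn):
    """Run every change that is not in the changelog yet; return the ids that ran."""
    applied = applied_ids(conn)
    ran = []
    for change_id, up_sql, _down_sql in CHANGES:
        if change_id not in applied:
            conn.execute(up_sql)
            conn.execute("INSERT INTO changelog (id) VALUES (?)", (change_id,))
            conn.commit()
            ran.append(change_id)
    return ran


def migrate_down(conn):
    """Roll back the most recent applied change; return its id, or None if nothing is applied."""
    applied = applied_ids(conn)
    for change_id, _up_sql, down_sql in reversed(CHANGES):
        if change_id in applied:
            conn.execute(down_sql)
            conn.execute("DELETE FROM changelog WHERE id = ?", (change_id,))
            conn.commit()
            return change_id
    return None
```

Calling `migrate_up` at application start-up is safe to repeat, because changes already listed in the changelog are skipped.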

You'll need a table containing the version of the database that the user is running, and then you'll need code to upgrade from version n to version n+1. Assuming you have a database user that has access to do schema changes, you can apply schema changes the same way you're now applying code changes.
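As a rough sketch of the version-table approach (Python with SQLite purely for illustration; the `schema_version` table and the two upgrade steps are assumptions, not anything from the application in the question):

```python
import sqlite3

# Hypothetical upgrade steps: UPGRADES[n] takes the schema from version n to n+1.
UPGRADES = [
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)",
    "ALTER TABLE customer ADD COLUMN email TEXT",
]


def current_version(conn):
    """Return the stored schema version, creating the version table on first use."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)")
    row = conn.execute("SELECT version FROM schema_version").fetchone()
    if row is None:
        conn.execute("INSERT INTO schema_version (version) VALUES (0)")
        conn.commit()
        return 0
    return row[0]


def upgrade(conn):
    """Apply every pending step in order and return the resulting version."""
    version = current_version(conn)
    for step in UPGRADES[version:]:
        conn.execute(step)
        version += 1
        conn.execute("UPDATE schema_version SET version = ?", (version,))
        conn.commit()
    return version
```

Because the version is stored in the database itself, a client who skipped several releases simply runs all the steps they are missing, in order, the next time the application starts.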

Related

Best Practices of continuous Integration with SQL Server project or local mdf file in project

Today I maintain a project with a really messy DB that needs a lot of refactoring and has to be published to clients' machines.
I know that I could add a SQL Server Database project that contains just the database scripts and creates a .dacpac file that allows me to change clients' databases automatically.
I also know that I could just add an .mdf file to the App_Data or even the Solution_Data folder and keep my database there. I suppose that LocalDB, which already exists, would allow me to start up my solution without SQL Server.
And finally, I know that Entity Framework exists with its own migrations. But I don't want to use it, because I can't add and change indexes with its migrations and I don't have enough flexibility when I need to describe difficult migration scenarios.
My goals:
Generate migration scripts for clients' DBs automatically.
Make my solution self-contained, so that any new programmer who joins the project doesn't even need to install SQL Server on their machine.
Be able to update the local (development) database in 1-2 clicks.
Be able to move back through the history of DB changes (I have a TFS server).
Be able to keep a clean DB (containing only dictionaries or lookup tables) in the solution, with an up-to-date DB schema.
Additionally, I want to be able to update my DB model (EF or .dbml) automatically or in a very easy way.
So what I want to ask:
What are the strengths and weaknesses of these 2 approaches if I want to achieve my goals?
Could it be that I should use some combination of these tools?
Or is there another existing tool from MS that I don't know about?
Is there a way to update my DAL model from this DB?
What are the strengths and weaknesses of these 2 approaches if I want to achieve my goals?
Using a database project allows you to version control all of the database objects. You can publish to various database instances and roll out changes incrementally, rather than having to drop and recreate the database, thus preserving data. These changes can be in the form of a dacpac, a SQL script, or done right through the VS interface. You gain a lot of control over deployments using pre- and post-deployment scripts and publishing profiles. Developers will be required to install SQL Server (the developer/express edition is usually good enough).
LocalDB is a little easier to work with -- you can make your changes directly in the database without having to publish. LocalDB doesn't have a built-in publish process for pushing changes to other instances. No SQL Server installation required.
Use a database project if you need version control for your database objects, if you have multiple users concurrently making changes, or if you have multiple applications that use the same database. Use LocalDB if none of those conditions apply or for small apps that require their own standalone database.
Could it be that I should use some combination of these tools?
Yes. According to Kevin's comment below, "If the Database Project is set as your startup project, hitting F5 will automatically deploy it to LocalDB. You don't even need a publish profile in this case."
Or is there another existing tool from MS that I don't know about?
Entity Framework's Code First approach comes close.
Is there a way to update my DAL model from this DB?
Entity Framework's POCO generator works well unless you make changes to your DAL classes, then those changes get lost the next time you run the generator.
There is a new tool called SqlSharpener which can generate classes from the SQL files in a database project. I have not used it so I cannot vouch for it but it looks promising.
One way to generate client scripts for DB changes is to use a database modeling tool like ERWin, which has a free community edition. The best way to meet your database version control requirement and get easy script generation is Redgate SQL Source Control. Using the Redgate tool you will meet the first five goals mentioned. Moreover, you can update the EF model with a single click after changing the DB schema (i.e. the Database First approach), as required in goal 6.
I do not recommend using LocalDB at all. It always causes issues with source control, like "DB file is in use and can't commit...". In addition, the developers on the project will not have a common set of up-to-date data to work on unless one developer adds test data to the database and asks the others to get the latest version and overwrite their own database, or generates an update script with the previously mentioned tool and asks every developer to run it on their LocalDB.
The best way in your situation is to use SQL Server on the network: a master version that all the developers use. Since you have version control on the database using the previously mentioned tool, you can roll back any buggy change on the database server.
If you think the Redgate tool is too expensive for your project's budget, a second approach is to generate a single SQL file from your database that contains all database objects, which the other developers then update in source control with their changes. This can be done easily by using the schema compare tool in Visual Studio and appending the generated script to the SQL file in source control. With the EF Database First approach, you will not have to add many migration classes as in EF Code First.

Keeping a database schema up to date

I'm writing an application that is using a database (currently MySQL 4) to store data.
It is likely that I will make changes to this later, in the form of updates that add additional data. Updating the application is simple: it essentially comes down to overwriting the program files with the new ones. However, how do I go about updating the database schema?
The database is remote and so my application might exist in several places, so simply dumping the ALTER and CREATE statements in an installer would result in the changes being made multiple times, and I have been asked explicitly for an automatic solution that allows for the application copies to be updated over a transition period, and for schema updates to be automatic.
I considered examining the schema at start-up to look for missing tables and columns, and adding them as needed, however this does not seem like a clean solution. I also considered putting some kind of “schema version” number on the database, but can’t see any way to do this short of a single row table with an int “Version” column which doesn’t seem a good way either.
I can highly recommend Liquibase. It really does work - I've used it and was very impressed.
Essentially, it keeps its own log of statements run on a database and runs them only if not already run/needed. It is XML driven and allows you to use optional pre- and post-execution statements and conditions. You check your XML files into your source control and invoke it from your build tool. It's even suitable for driving production releases.
It's magic.
Rather than rolling your own system for versioning your database it's probably worth looking into an existing framework that will manage it for you.
I use Liquibase and have integrated it into my build using the Maven plugin. Worth checking out!
Just as you proposed, add a table where you store the current version of the database schema. Then you only have to apply the changes between your last schema update and the new release, and set the new version number accordingly. I've done this to update our production database about 300 times, it just works.

SQL Server database change workflow best practices

The Background
My group has 4 SQL Server Databases:
Production
UAT
Test
Dev
I work in the Dev environment. When the time comes to promote the objects I've been working on (tables, views, functions, stored procs) I make a request of my manager, who promotes to Test. After testing, she submits a request to an Admin who promotes to UAT. After successful user testing, the same Admin promotes to Production.
The Problem
The entire process is awkward for a few reasons.
Each person must manually track their changes. If I update, add, remove any objects I need to track them so that my promotion request contains everything I've done. In theory, if I miss something testing or UAT should catch it, but this isn't certain and it's a waste of the tester's time, anyway.
Lots of changes I make are iterative and done in a GUI, which means there's no record of what changes I made, only the end result (at least as far as I know).
We're in the fairly early stages of building out a data mart, so the majority of the changes made, at least count-wise, are minor things: changing the data type for a column, altering the names of tables as we crystallize what they'll be used for, tweaking functions and stored procs, etc.
The Question
People have been doing this kind of work for decades, so I imagine there has got to be a much better way to manage the process. What I would love is to run a diff between two databases to see how their structures differ, use that diff to generate a change script, and use that change script as my promotion request. Is this possible? If not, are there any other ways to organize this process?
For the record, we're a 100% Microsoft shop, just now updating everything to SQL Server 2008, so any tools available in that package would be fair game.
I should clarify I'm not necessarily looking for diff tools. If that's the best way to sync our environments then it's fine, but if there's a better way I'm looking for that.
An example doing what I want really well are migrations in Ruby on Rails. Dead simple syntax, all changes are well documented automatically and by default, determining what migrations need to run is almost trivially easy. I'd love if there was something similar to this for SQL Server.
My ideal solution is 1) easy and 2) hard to mess up. Rails Migrations are both; everything I've done so far on SQL Server is neither.
Within our team, we handle database changes like this:
We (re-)generate a script which creates the complete database and check it into version control together with the other changes. We have 4 files: tables, user defined functions and views, stored procedures, and permissions. This is completely automated - only a double-click is needed to generate the script.
If a developer has to make changes to the database, she does so on her local db.
For every change, we create update scripts. Those are easy to create: the developer regenerates the db script of her local db. All the changes are now easy to identify thanks to version control. Most changes (new tables, new views etc.) can simply be copied to the update script; other changes (adding columns, for example) need to be written manually.
The update script is tested either on our common dev database, or by rolling back the local db to the last backup - which was created before starting to change the database. If it passes, it's time to commit the changes.
The update scripts follow a naming convention so everybody knows in which order to execute them.
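The answer does not say what the naming convention is. As one possible illustration, assume scripts are named `<number>_<description>.sql` (a hypothetical convention); the execution order can then be derived from the names instead of being agreed on by hand. Sorting numerically avoids the usual trap where `10_...` sorts before `2_...` as plain text.

```python
import re


def ordered_scripts(names):
    """Sort update scripts by their leading number, rejecting bad or duplicate names."""
    numbered = {}
    for name in names:
        match = re.match(r"(\d+)_.+\.sql$", name)
        if match is None:
            raise ValueError(f"script name does not follow the convention: {name}")
        number = int(match.group(1))
        if number in numbered:
            raise ValueError(f"duplicate script number {number}: {name}")
        numbered[number] = name
    return [numbered[n] for n in sorted(numbered)]
```

Rejecting duplicate numbers also catches the case where two developers picked the same number for different update scripts.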
This works fairly well for us, but still needs some coordination if several developers heavily modify the same tables and views. This doesn't happen often though.
The important points are:
database structure is only modified by scripts, except for the local developer's db. This is important.
SQL scripts are versioned by source control - the db can be created as it was at any point in the past
database backups are created regularly - at least before making changes to the db
changes to the db can be done quickly - because the scripts for those changes are created relatively easily.
However, if you have a lot of long lasting development branches for your projects, this may not work well.
It is by far not a perfect solution, and some special precautions are to be taken. For example, if there are updates which may fail depending on the data present in a database, the update should be tested on a copy of the production database.
In contrast to rails migrations, we do not create scripts to reverse the changes of an update. But this isn't always possible anyway, at least in respect to the data (the content of a dropped column is lost even if you recreate the column).
Version Control and your Database
The root of all things evil is making changes in the UI. SSMS is a DBA tool, not a developer one. Developers must use scripts to do any sort of changes to the database model/schema. Versioning your metadata and having upgrade script from every version N to version N+1 is the only way that is proven to work reliably. It is the solution SQL Server itself deploys to keep track of metadata changes (resource db changes).
Comparison tools like SQL Compare or vsdbcmd and .dbschema files from VS Database projects are just last resorts for shops that fail to do a proper versioned approach. They work in simple scenarios, but I have seen them all fail spectacularly in serious deployments. One just does not trust a tool to make a change to a 5+ TB table if the tool tries to copy the data...
RedGate sells SQL Compare, an excellent tool to generate change scripts.
Visual Studio also has editions which support database compares. This was formerly called Database Edition.
Where I work, we abolished the Dev/Test/UAT/Prod separation long ago in favor of a very quick release cycle. If we put something broken in production, we will fix it quickly. Our customers are certainly happier, but in the risk-averse corporate enterprise, it can be a hard sell.
There are several tools available for you. One is from Red-Gate called SQL Compare. Awesome and highly recommended. SQL Compare will let you do a diff in schemas between two databases and even build the sql change scripts for you.
Note they have been working on a SQL Server source control product for awhile now as well.
Another (if you're a Visual Studio shop) is the schema and data compare features that are part of Visual Studio (not sure which versions).
Agree that SQL Compare is an amazing tool.
However, we do not make any changes to the database structure or objects that are not scripted and saved in source control just like all other code. Then you know exactly what belongs in the version you are promoting because you have the scripts for that particular version.
It is a bad idea anyway to make structural changes through the GUI. If you have a lot of data, it is far slower than using ALTER TABLE, at least in SQL Server. You also only want to use tested scripts to make changes to prod.
I agree with the comments made by marapet, where each change must be scripted.
The problem that you may be experiencing, however, is creating, testing and tracking these scripts.
Have a look at the patching engine used in DBSourceTools.
http://dbsourcetools.codeplex.com
It's been specifically designed to help developers get SQL server databases under source-code control.
This tool will allow you to baseline your database at a specific point, and create a named version (v1).
Then, create a deployment target - and increment the named version to v2.
Add patch scripts to the Patches directory for any changes to schema or data.
Finally, check the database and all patches into source-code control, to distribute with devs.
What this gives you is a repeatable process to test all patches to be applied from v1 to v2.
DBSourceTools also has functionality to help you create these scripts, i.e. schema compare or script data tools.
Once you are done, simply send all of the files in the patches directory to your DBA to upgrade from v1 to v2.
Have fun.
Another "Diff" tool for databases:
http://www.xsqlsoftware.com/Product/Sql_Data_Compare.aspx
Keep database version in a versioning table
Keep script file name that was successfully applied
Keep the md5 sum of each SQL script that has been applied. It should ignore whitespace when calculating the md5 sum. Must be effective.
Keep info about who applied a script
Keep info about when a script was applied
Database should be verified on application start-up
New sql script should be applied automatically
If md5 sum of a script that was already applied is changed, error should be thrown (in a production mode)
Once a script has been released, it must not be changed. It must be immutable in a production environment.
Scripts should be written so that they can be applied to different types of database (see Liquibase)
Since most DDL statements are auto-committing on most databases, it is best to have a single DDL statement per SQL script.
DDL statements should be written so that they can be executed several times without errors. This really helps in dev mode, when you may edit a script several times. For instance, create a new table only if it does not exist, or even drop the table before creating a new one. In dev mode, with a script that has not been released, you can then change it, clear the md5 sum for this script, and rerun it.
Each SQL script should be run in its own transaction.
Triggers/procedures should be dropped and recreated after each DB update.
Sql script is kept in a versioning system like svn
Name of a script contains date when it was committed, existing (jira) issue id, small description
Avoid adding rollback functionality to scripts (Liquibase allows you to do that). It makes them more complicated to write and support. If you use exactly one DDL statement per script, and DML statements are run within a transaction, even a failing script will not be much trouble to resolve.
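A few of the rules above (record the script name and md5 sum, ignore whitespace, apply new scripts automatically, fail if a released script changed) could be sketched like this. It is an illustration in Python with SQLite, not Liquibase; the `applied_script` table and the script names are hypothetical, and recording who applied a script is left out.

```python
import hashlib
import sqlite3


def checksum(sql):
    """MD5 of the script text with all whitespace removed, so reformatting is ignored."""
    normalized = "".join(sql.split())
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def apply_scripts(conn, scripts):
    """Apply new scripts in name order; fail if an applied script was edited.

    `scripts` maps a script name to its SQL text (one statement per script).
    Returns the names applied by this call.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS applied_script ("
        "name TEXT PRIMARY KEY, md5 TEXT NOT NULL, "
        "applied_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )
    applied = dict(conn.execute("SELECT name, md5 FROM applied_script"))
    newly_applied = []
    for name in sorted(scripts):
        digest = checksum(scripts[name])
        if name in applied:
            if applied[name] != digest:
                raise RuntimeError(f"released script was modified: {name}")
            continue
        conn.execute(scripts[name])
        conn.execute(
            "INSERT INTO applied_script (name, md5) VALUES (?, ?)", (name, digest)
        )
        conn.commit()
        newly_applied.append(name)
    return newly_applied
```

Note that stripping all whitespace is a blunt normalization: it also ignores whitespace changes inside string literals, which a stricter implementation might want to detect.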
This is the workflow we have been using successfully:
Development instance: SQL objects are created/updated/deleted in the DB using MSSQL Studio, and all operations are saved to scripts that we include in each version of our code.
Moving to production: We compare schema between dev and prod db using SQL Schema Compare in Microsoft Visual Studio. We update prod using the same tool.

What's your process for dealing with database schema changes in a dev team?

Here's a more general question on how you handle database schema changes in a development team.
We are a team of developers and the databases used during development are running locally on everyone's box as we want to avoid the requirement to have web access all the time. So running a single central database instance somewhere is not a real option.
Whenever one of us decides that it is time to extend/change the db schema, we mail database files (MYI/MYD) or SQL files to execute around, or give others instructions on the phone what they need to do to get the changed code running on their local DBs. That's not the perfect approach for sure. The same problem arises when we need to adjust the DB schema on staging or production once a new release is ready.
I was wondering ... how do you guys handle this kind of stuff? For source code, we use SVN.
Really appreciate your input!
Thanks,
Michael
One approach we've used in the past is to script the entire DDL for the database, along with any test/setup data needed. Store that in SVN, then when there's a change, any developer can pull down the changes, drop the database, and rebuild it from the script files.
At the very least you should have the scripts of all the objects in the database (tables, stored procedures, etc) under source control.
I don't think mailing schema changes is a real option for a professional development team.
We had a system on one of my previous teams that was the best I've encountered for dealing with this situation.
The nightly build of the application included a build of a database (SQL Server). The database got built to the Test DB server. Each developer then had a DTS package (this was a while ago, and I'm sure they upgraded to SSIS packages) to pull down that nightly DB build to their local DB environment.
This kept the master copy in one location and put the onus on the developers to keep their local dev databases fresh.
At my work, we deal with pretty large databases that are time-consuming to generate, so for us, starting from scratch with a new DB isn't ideal. Like Harper, we have our DDL in SVN. Additionally, we store a version number in a database table. Every check-in that changes the DB must be accompanied by a script that:
Will upgrade the database schema and modify any existing data appropriately, and
Will update the version number in the database.
Further, we number the scripts and database versions such that a script we've written knows how to upgrade further along a branch or from an older branch to a newer one without any input from the developer (apart from the database name and the directory to the upgrade scripts).
Thus, if I've got a copy of a customer's 4GB DB that's from a year old version and I want to test how their data will work with the version we cut yesterday, I can just run our script and let it handle the upgrades rather than having to start from scratch and redo every INSERT, UPDATE and DELETE performed since the database was created.
We have a non-SQL description of the database schema. When the application starts, it compares the desired database schema with the actual database schema, and performs whatever ADD TABLE, ADD COLUMN, ADD INDEX, etc. statements it needs to do to get the database to look right.
This doesn't handle every case; sometimes you have to delete the database and recreate it if you've changed something that the schema resolver can't handle, but most of the time we don't need to worry about it.
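A minimal sketch of such a schema resolver, assuming Python with SQLite and a hypothetical `DESIRED` description (the answer does not say what format its non-SQL description uses). It only handles missing tables and missing columns, which matches the limitation described above.

```python
import sqlite3

# Hypothetical desired schema: table name -> {column name: column type}.
# Names are trusted constants here, so they are interpolated into SQL directly.
DESIRED = {
    "customer": {"id": "INTEGER", "name": "TEXT", "email": "TEXT"},
    "invoice": {"id": "INTEGER", "customer_id": "INTEGER"},
}


def pending_statements(conn, desired):
    """Return the CREATE TABLE / ADD COLUMN statements needed to reach `desired`."""
    statements = []
    for table, columns in desired.items():
        # PRAGMA table_info yields one row per column; the name is at index 1.
        existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
        if not existing:
            column_sql = ", ".join(f"{c} {t}" for c, t in columns.items())
            statements.append(f"CREATE TABLE {table} ({column_sql})")
            continue
        for column, column_type in columns.items():
            if column not in existing:
                statements.append(
                    f"ALTER TABLE {table} ADD COLUMN {column} {column_type}"
                )
    return statements


def sync_schema(conn, desired):
    """Apply the pending statements and return them, e.g. for logging."""
    statements = pending_statements(conn, desired)
    for statement in statements:
        conn.execute(statement)
    conn.commit()
    return statements
```

Renamed or dropped columns and changed types are not detected by this sketch; those are the cases where a hand-written upgrade script (or a rebuild) is still needed.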
I'd certainly keep the database schema in source code control.
At my present job, every time there's a schema change, we write the SQL for the change (alter table xyz add column ...) and put it in SVN. Then developers can update test databases by running this script. It's pretty clumsy but it works.
At a previous job I wrote some code that at application start-up would automatically compare the actual database schema to what it expected, and if it was not up to date perform the updates. Mostly this was done for deployment reasons: When we shipped new copies of the software, it would then automatically update the user's database. But it was also handy for developers.
I think there should be some generic SQL tool to do this. Maybe there is, but I've never seen one.

Tools to work with stored procedures in Oracle, in a team?

What tools do you use to develop Oracle stored procedures, in a team :
To automatically "lock" the current procedure you are working with, so nobody else in the team can make changes to it until you are finished.
To automatically send the changes you make in the stored procedure, in an Oracle database, to a Subversion, CVS, ... repository
Thanks!
I'm not sure if the original poster is still monitoring this, but I'll ask the question anyways.
The original post requested to be able to:
To automatically "lock" the current procedure you are working with, so nobody else in the team can make changes to it until you are finished.
Perhaps the problem here is one of development paradigm more than the inability of a product to "lock" the stored proc. Whenever I hear "I want to lock this so no one else changes it" I immediately get the feeling that people are sharing a schema and everyone is developing in the same space.
If this is the case, why not simply let everyone have their own schema with a copy of the data model? I mean seriously folks, it doesn't "cost" anything to create another schema. That way, each developer can make changes until they're blue in the face without affecting anyone else.
Another trick I've used in the past (on small teams), when it wasn't feasible to let every developer have their own copy of the data because of size, was to have a master schema with all the tables and code in it, with public synonyms pointing to it all. Then, if a developer wants to work on a stored proc, they simply create it in their own schema. That way Oracle name resolution finds that one first instead of the copy in the master schema, allowing them to test their code without affecting anyone else. This does have its drawbacks, but this was a very specific case where we could live with them. I would NEVER implement something like this in production, obviously.
As for the second requirement:
To automatically send the changes you make in the stored procedure, in an Oracle database, to a Subversion, CVS, ... repository
I'd be surprised to find tools out there smart enough to do this (perhaps an opportunity :). It would have to connect to your db, query the data dictionary (USER_SOURCE) and pull out the associated text. A tall order for source control systems, which are almost universally file based.
Oracle's new SQL Developer has version control built-in.
Here is a link to the product.
http://www.oracle.com/technology/products/database/sql_developer/files/what_is_sqldev.html
http://www.oracle.com/technology/products/database/sql_developer/images/what_version.png
Treat PL/SQL as usual code : store it in files, and manage these files with your revision control tool and your internal procedures.
If you do not already have a revision control tool, then write your requirements down and pick one up. A lot of people it seems use Subversion, associated to TortoiseSVN as a client on Windows (I do).
The thing is : use your tool as is recommended, and adapt your procedures accordingly. For instance, Subversion uses a copy-modify-merge model by default, as opposed to a lock-modify-unlock model which you seem to favor.
In my case, I like to use TortoiseSVN, as stated above. And as is usual with this tool :
I never lock any files. This is very manageable with small teams, and it requires planning ahead on larger ones, which is always a good thing IMHO.
I send my changes manually back to the server, because ... I don't think there's another way with Subversion (plus, internal procedures forbid a commit without a message, which is also a good thing IMHO).
And whatever your choice, I recommend reading this post (and related ones) about database versioning.
A relatively simple (if slightly old-fashioned) solution might be to use a "locking" rather than "merge" mode version control system.... Subversion or CVS generally use a "merge" mode (although I believe Subversion can be made to "lock" files?)
"Locking" mode version control systems do have their own drawbacks of course.....
The only way I can think of doing it in Oracle might be some sort of BEFORE CREATE trigger, maybe referencing a table to look up who can run a package. Sounds a bit nasty though?
Using Source Control for Oracle you get a lot of what you're looking for.
Stored procedures (as well as packages, functions, tables etc.) can be locked manually using the interface, not automatically, but this does prevent others making changes.
The new SQL to create the object can then be checked into SVN or TFS (no CVS support unfortunately).
The tool is not free but has a free 28-day trial.
Using Oracle SQL Developer 1.5, you can easily create and manage connections to CVS or Subversion. To create a CVS connection (for example), click Versioning -> CVS -> Check out Module. You will run through a wizard to create the connection (host, username, etc), then you can check your procedures/functions out and in as normal.
Integration with CVS is also provided in Toad.
You may also want to look at Aqua Data Studio. It has SVN built in as well and is a great stored proc editor.
After searching for a tool to handle version control for Oracle objects with no luck we created the following (not perfect but suitable) solution:
1. Using the dbms_metadata package, we create a metadata dump of our Oracle server. We create one file per object, so the result is not one huge file but a bunch of files. To detect deleted objects, we delete all the files before creating the dump again.
2. We copy all the files from the server to the client computer.
3. Using Netbeans, we detect the changes and commit them to the CVS server (or check the diffs...). Any CVS-handler software would work here, but we were already using Netbeans for other purposes. Netbeans also allows us to create an ant task for calling the Oracle process mentioned in step 1, copying the files mentioned in step 2...
Here is the most important query for step 1:
SELECT object_type, object_name,
       dbms_metadata.get_ddl(object_type, object_name) object_ddl
FROM user_objects
WHERE object_type IN ('INDEX', 'TRIGGER', 'TABLE', 'VIEW', 'PACKAGE',
                      'FUNCTION', 'PROCEDURE', 'SYNONYM', 'TYPE')
ORDER BY object_type, object_name
One file per object approach helps to identify the changes. If I add a field to table TTTT (not a real table name of course) then only TABLE_TTTT.SQL file will be modified.
Both step 1 and step 3 are slow processes (several minutes for a few thousand files).
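The file-writing half of step 1 might look roughly like this in Python. It is a sketch, not the poster's actual process: `write_object_files` is a hypothetical helper, and it assumes the rows of the query above have already been fetched through some Oracle driver with the DDL converted to plain strings.

```python
from pathlib import Path


def write_object_files(rows, target_dir):
    """Write one <TYPE>_<NAME>.SQL file per object and return the file names written.

    `rows` is an iterable of (object_type, object_name, ddl) string tuples.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    # Remove the previous dump first so that dropped objects show up as deleted files.
    for old_file in target.glob("*.SQL"):
        old_file.unlink()
    written = []
    for object_type, object_name, ddl in rows:
        file_name = f"{object_type}_{object_name}.SQL".replace(" ", "_")
        (target / file_name).write_text(ddl.strip() + "\n", encoding="utf-8")
        written.append(file_name)
    return sorted(written)
```

After each run, the version control client sees a modified file for every changed object and a missing file for every dropped one, which is what makes the per-object diffs possible.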
Toad also does this without requiring CVS / SVN.