How do you prepare your SQL deltas? Do you manually save each schema-changing SQL statement to a delta folder, or do you have some kind of automated diffing process?
I am interested in conventions for versioning database schema along with the source code. Perhaps a pre-commit hook that diffs the schema?
Also, what options for diffing deltas exist aside from DbDeploy?
EDIT: seeing the answers I would like to clarify that I am familiar with the standard scheme for running a database migration using deltas. My question is about creating the deltas themselves, preferably automatically.
Also, the versioning is for PHP and MySQL if it makes a difference. (No Ruby solutions please).
See
Is there a version control system for database structure changes?
How do I version my MS SQL database in SVN?
and Jeff's article
Get Your Database Under Version Control
I feel your pain, and I wish there were a better answer. This might be closer to what you were looking for.
Mechanisms for tracking DB schema changes
Generally, I feel there is no adequate, accepted solution to this, and I roll my own in this area.
You might take a look at another, similar thread: How do I version my MS SQL database in SVN?.
If you are still looking for options, have a look at neXtep designer. It is a free GPL database development environment based on the concepts of version control. In the environment you always work with versioned entities and can focus on data model development. Once a release is done, the SQL generation engine plugged into the version control system can generate any delta you need between 2 versions, and offers a delivery mechanism if you need one.
Among other things, you can synchronize and reverse synchronize your database during developments, create data model diagrams, query your database using integrated SQL clients, etc.
Have a look at the wiki for more information:
http://www.nextep-softwares.com/wiki
It currently supports Oracle, MySQL and PostgreSQL, and it is written in Java, so the product runs on Windows, Linux and Mac.
I don't manage deltas. I make changes to a master database and have a tool that creates an XML based build script based on the master database.
When it comes time to upgrade an existing database I have a program that uses the XML based build script to create a new database and the bare tables. I then copy the data over from the old database using INSERT INTO x SELECT FROM y and then apply all indexes, constraints and triggers.
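The copy step described above can be sketched roughly as follows. This is only an illustration in Python with SQLite, not the author's tool: the table names and the copy_common_columns helper are hypothetical, and the sketch assumes that columns keep their names between the old and new schema.

```python
import sqlite3

def copy_common_columns(conn, old_table, new_table):
    """Copy rows from old_table into new_table for the columns both tables share."""
    def columns(table):
        # PRAGMA table_info returns one row per column; index 1 is the column name.
        return [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]

    old_cols = set(columns(old_table))
    shared = [c for c in columns(new_table) if c in old_cols]
    col_list = ", ".join(f'"{c}"' for c in shared)
    # The INSERT INTO x SELECT FROM y step, restricted to the shared columns.
    conn.execute(
        f'INSERT INTO "{new_table}" ({col_list}) SELECT {col_list} FROM "{old_table}"'
    )
    return shared

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE old_customer (id INTEGER, name TEXT, fax TEXT)")
conn.execute("INSERT INTO old_customer VALUES (1, 'Ann', '555-0100')")
# Hypothetical new schema: drops "fax" and adds "email".
conn.execute("CREATE TABLE new_customer (id INTEGER, name TEXT, email TEXT)")
shared = copy_common_columns(conn, "old_customer", "new_customer")
rows = conn.execute("SELECT id, name, email FROM new_customer").fetchall()
```

Dropped columns are simply not copied and new columns stay NULL. Renames and type changes would need the extra "tricks" mentioned below, such as a mapping from old to new column names.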
New tables, new columns, deleted columns all get handled automatically and with a few little tricks to adjust the copy routine I can handle column renames, column type changes and other basic refactorings.
I wouldn't recommend this solution on a database with a huge amount of data but I regularly update a database that is over 1GB with 400 tables.
I make sure that schema changes are always additive. So I don't drop columns and tables, because that would zap the data and cannot be rolled back later. This way the code that uses the database can be rolled back without losing data or functionality.
I have a migration script that contains statements that create tables and columns if they don't exist yet and fill them with data.
The migration script runs whenever the production code is updated and after new installs.
When I want to drop something, I remove it from the database install script and the migration script, so these obsolete schema elements are gradually phased out in new installs. The disadvantage is that a new install cannot be downgraded to a version older than the one it was installed with.
And of course I execute DDLs via these scripts and never directly on the database to keep things in sync.
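A minimal sketch of such an additive, re-runnable migration script, written in Python against SQLite purely for illustration. The account table, the status column and the column_exists helper are made up for the example; on MySQL the existence check would typically query information_schema instead of PRAGMA table_info.

```python
import sqlite3

def column_exists(conn, table, column):
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    return any(row[1] == column for row in conn.execute(f'PRAGMA table_info("{table}")'))

def migrate(conn):
    """Additive migration: only creates what is missing, never drops anything."""
    conn.execute("CREATE TABLE IF NOT EXISTS account (id INTEGER PRIMARY KEY, name TEXT)")
    if not column_exists(conn, "account", "status"):
        conn.execute("ALTER TABLE account ADD COLUMN status TEXT")
        # Fill the new column for rows that already existed.
        conn.execute("UPDATE account SET status = 'active' WHERE status IS NULL")
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO account (name) VALUES ('Ann')")
migrate(conn)
migrate(conn)  # running it again changes nothing
result = conn.execute("SELECT name, status FROM account").fetchall()
```

Because every step checks before it acts, the same script can run after each code update and on fresh installs.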
You didn't mention which RDBMS you're using, but if it's MS SQL Server, Red-Gate's SQL Compare has been indispensable to us in creating deltas between object creation scripts.
I'm not one to toot my own horn, but I've developed an internal web app to track changes to database schemas and create versioned update scripts.
This tool is called Brazil and is now open source under an MIT license. Brazil is based on Ruby / Ruby on Rails and supports change deployment to any database that Ruby DBI supports (MySQL, ODBC, Oracle, Postgres, SQLite).
Support for putting the update scripts in version control is planned.
http://bitbucket.org/idler/mmp - schema versioning tool for MySQL, written in PHP
We export the data to a portable format (using our toolchain), then import it into a new schema. No need for delta SQL. Highly recommended.
I use the Firebird database for most development, and I use the FlameRobin administration tool for it. It has a nice option to log all changes. It can log everything to one big file, or to one file per database change. I use the second option, and then I store each script in version control software: earlier I used Subversion, now I use Git.
I assume you can find some MySQL tool that has the same logging feature as FlameRobin has for Firebird.
In one of the database tables, I store the version number of the database structure, so I can upgrade any database easily. I also wrote a simple PHP script that executes those SQL scripts one by one on any target database (database path and username/password are supplied on the command line).
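The version-table idea can be sketched as follows. This is an illustrative Python/SQLite version, not the PHP script mentioned above: the schema_version table name and the DELTAS dictionary are assumptions, and in practice the deltas would be the logged script files kept in version control.

```python
import sqlite3

# Hypothetical deltas keyed by version number.
DELTAS = {
    1: "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)",
    2: "ALTER TABLE customer ADD COLUMN email TEXT",
    3: "CREATE INDEX idx_customer_email ON customer (email)",
}

def current_version(conn):
    """Read the stored structure version; an empty table counts as version 0."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0

def upgrade(conn):
    """Apply, in order, every delta newer than the stored version."""
    applied = []
    start = current_version(conn)
    for version in sorted(v for v in DELTAS if v > start):
        conn.execute(DELTAS[version])
        conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
        conn.commit()
        applied.append(version)
    return applied

conn = sqlite3.connect(":memory:")
first = upgrade(conn)   # applies deltas 1, 2 and 3
second = upgrade(conn)  # nothing left to apply
```

Recording the version after each script, rather than once at the end, means a failed run can be resumed from the last script that succeeded.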
There's also an option to log all DML (insert, update, delete) statements, and I activate it while modifying the 'default' data that each database contains.
I wrote a nice white paper on how I do all this in detail. You can download the paper in .pdf format along with demo PHP scripts from here.
I also developed a set of PHP scripts where developers can submit their deltasql scripts to a central repository.
In one of the database tables (called TBSYNCHRONIZE), I store the version number of the latest executed script, so I can upgrade any database easily by using the web interface or a client developed on purpose for Eclipse.
The web interface lets you manage several projects. It also supports database "branches".
You can test the application at http://www.gpu-grid.net/deltasql (if you login as admin with password testdbsync).
The application is open source and can be downloaded here:
http://sourceforge.net/projects/deltasql
deltasql is used productively in Switzerland and India, and is popular in Japan.
Some months ago I searched for a tool for versioning a MySQL schema. I found many useful tools, such as Doctrine migrations, RoR migrations, and some tools written in Java and Python.
But none of them satisfied my requirements.
My requirements:
No requirements other than PHP and MySQL
No schema configuration files, like schema.yml in Doctrine
Able to read the current schema from a connection and create a new migration script that reproduces an identical schema in other installations of the application.
I started writing my own migration tool, and today I have a beta version.
Please try it if you are interested in this topic.
Please send me feature requests and bug reports.
Source code: bitbucket.org/idler/mmp/src
Overview in English: bitbucket.org/idler/mmp/wiki/Home
Overview in Russian: antonoff.info/development/mysql-migration-with-php-project
I use http://code.google.com/p/oracle-ddl2svn/
I am interested in this topic too.
There are some discussions on this topic in the Django wiki.
Interestingly, it looks like CakePHP has schema versioning built in, using just the cake schema generate command.
I am using strict versioning of the database schema (tracked in a separate table). Scripts are stored in version control, but they all verify current schema version before making any change.
Here is the full implementation for SQL Server (the same solution could be developed for MySQL if needed): How to Maintain SQL Server Database Schema Version
For MySQL
When I land on a new DB:
Firstly, I check structure:
mysqldump --no-data --skip-comments --skip-extended-insert -h __DB_HOSTNAME__ -u __DB_USERNAME__ -p __DB1_NAME__ | sed 's/ AUTO_INCREMENT=[0-9]*//g' > FILENAME_1.sql
mysqldump --no-data --skip-comments --skip-extended-insert -h __DB_HOSTNAME__ -u __DB_USERNAME__ -p __DB2_NAME__ | sed 's/ AUTO_INCREMENT=[0-9]*//g' > FILENAME_2.sql
diff FILENAME_1.sql FILENAME_2.sql > DIFF_FILENAME.txt
cat DIFF_FILENAME.txt | less
Thanks to stackoverflow users I could write this quick script to find structure differences.
src : https://stackoverflow.com/a/8718572/4457531 & https://stackoverflow.com/a/26328331/4457531
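The same comparison can also be done programmatically, for example from a pre-commit hook. The following is a rough Python sketch using the standard difflib module. It assumes the two dumps have already been produced by mysqldump as above and read into strings; the schema_diff and normalize names and the sample dumps are made up for illustration.

```python
import difflib
import re

def normalize(dump):
    """Drop AUTO_INCREMENT counters, which differ between otherwise identical schemas."""
    return re.sub(r" AUTO_INCREMENT=[0-9]*", "", dump).splitlines()

def schema_diff(dump_a, dump_b):
    """Return only the added (+) and removed (-) lines between two schema dumps."""
    diff = list(difflib.unified_diff(
        normalize(dump_a), normalize(dump_b),
        fromfile="db1", tofile="db2", n=0, lineterm="",
    ))
    # The first two entries are the file headers; "@@" lines are hunk markers.
    return [line for line in diff[2:] if not line.startswith("@@")]

dump_a = """CREATE TABLE `user` (
  `id` int NOT NULL AUTO_INCREMENT,
  `name` varchar(50)
) ENGINE=InnoDB AUTO_INCREMENT=42;"""
dump_b = """CREATE TABLE `user` (
  `id` int NOT NULL AUTO_INCREMENT,
  `name` varchar(50),
  `email` varchar(100)
) ENGINE=InnoDB AUTO_INCREMENT=7;"""
changes = schema_diff(dump_a, dump_b)
```

The result is a textual diff, not a ready-to-run delta: turning the changed lines into ALTER statements is still a manual step, or a job for a dedicated tool.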
In a second step, I check the data table by table with mysqldiff. It's a bit archaic, but a PHP loop based on information_schema data does the job reliably.
For versioning, I work the same way, but I build a SQL update script (to upgrade or roll back) from the diff results, and I use a version number convention (after several modifications the version number looks like an IP address).
initial version: 1.0.0
first number: structure change
second number: data added
third number: data updated
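As a small illustration of this numbering convention, here is a hedged Python sketch. The bump function and its change names are hypothetical, and the sketch assumes that incrementing one field leaves the other two untouched, which the description above does not specify.

```python
def bump(version, change):
    """Increment one field of a 'structure.data_added.data_updated' version string."""
    fields = {"structure": 0, "data_added": 1, "data_updated": 2}
    parts = [int(p) for p in version.split(".")]
    # Assumption: the other fields are not reset when one field is incremented.
    parts[fields[change]] += 1
    return ".".join(str(p) for p in parts)

after_schema_change = bump("1.0.0", "structure")
after_new_rows = bump("1.0.0", "data_added")
after_row_update = bump("1.4.9", "data_updated")
```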
I switched to Corda Enterprise mainly to try how it handles automated database migration.
The documentation here says that tools-database-manager generates only the SQL version of the Liquibase script for the initial DB, and that the SQL version is database-specific, so it should not be used for production.
But it is also possible to generate the XML with the Liquibase command line, using this command:
/snap/bin/liquibase --url="jdbc:h2:tcp://localhost:10039/node" --driver=org.h2.Driver --classpath=/home/corda/Downloads/h2.jar generateChangeLog
which I did. I then had to remove all the changelogs related to Corda internal tables, leaving only the ones for my own tables, and everything seems to work.
So the question is: might this approach have hidden dangers that I don't know about? Otherwise, why did the Corda team develop tools-database-manager, and why doesn't it support XML generation yet?
And this leads to another question: what if I, for example, forget to include one of my tables in the initial script? It seems Corda does not complain about it. Won't my table be created? Will I ever be able to migrate that table if it is missing from the initial script?
Firstly, tools-database-manager is a helper tool that makes it easy for developers to perform database migration.
Let's say you have 2 nodes in your network, each using a different database. PartyA uses PostgreSQL and PartyB uses Oracle. If PartyA uses this tool to create the migration script by connecting to PostgreSQL, it will output SQL statements specific to PostgreSQL.
Hence the script is not portable, which is why the generated script is said to be database-specific.
Also, you do not want to trust a script and run it directly on your production database. It contains DDL statements, so it is strongly recommended that every time a script is generated, you review it manually to make sure you know what it does.
There are a lot of enhancements going on in this space, supporting XML for migration script being one of them.
As mentioned earlier, you should review the migration script manually. If you forget to add one of your tables, Corda will not complain. It will fail some time later, when your code tries to access this table.
Yes, you can stop the node and create the table again by adding a create table script.
I currently maintain a project with a really messy DB that needs a lot of refactoring, and the changes then have to be published to clients' machines.
I know that I could add a SQL Server Database project that contains just the scripts of the database and creates a .dacpac file that allows me to change clients' databases automatically.
I also know that I could just add an .mdf file to the App_Data or even the Solution_Data folder and have my database there. I suppose that LocalDB, which already exists, allows me to start up my solution without SQL Server.
And lastly, I know that Entity Framework exists with its own migrations. But I don't want to use it, because I can't add and change indexes with its migrations and I don't have enough flexibility when I need to describe difficult migration scenarios.
My goals:
Generate migration scripts for clients' DBs automatically.
Make my solution self-contained, so that any new programmer who joins the project doesn't even need to install SQL Server on their machine.
Be able to update the local (development) database in 1-2 clicks.
Be able to move back through the history of DB changes (I have a TFS server).
Be able to have a clean DB (with only dictionaries or lookup tables) in the solution, with an up-to-date DB schema.
Additionally, I want to be able to update my DB model (EF or .dbml) automatically or in a very easy way.
So what I want to ask:
What are the strengths and weaknesses of these 2 approaches if I want to achieve my goals?
Should I perhaps use some combination of these tools?
Or is there another existing tool from MS that I don't know about?
Is there a way to update my DAL model from this DB?
What are the strengths and weaknesses of these 2 approaches if I want to achieve my goals?
Using a database project allows you to version control all of the database objects. You can publish to various database instances and roll out changes incrementally, rather than having to drop and recreate the database, thus preserving data. These changes can be in the form of a dacpac, a SQL script, or done right through the VS interface. You gain a lot of control over deployments using pre- and post-deployment scripts and publishing profiles. Developers will be required to install SQL Server (the developer/express edition is usually good enough).
LocalDB is a little easier to work with -- you can make your changes directly in the database without having to publish. LocalDB doesn't have a built-in publish process for pushing changes to other instances. No SQL Server installation required.
Use a database project if you need version control for your database objects, if you have multiple users concurrently making changes, or if you have multiple applications that use the same database. Use LocalDB if none of those conditions apply or for small apps that require their own standalone database.
Should I perhaps use some combination of these tools?
Yes. According to Kevin's comment below, "If the Database Project is set as your startup project, hitting F5 will automatically deploy it to LocalDB. You don't even need a publish profile in this case."
Or is there another existing tool from MS that I don't know about?
Entity Framework's Code First approach comes close.
Is there a way to update my DAL model from this DB?
Entity Framework's POCO generator works well unless you make changes to your DAL classes. In that case, those changes get lost the next time you run the generator.
There is a new tool called SqlSharpener which can generate classes from the SQL files in a database project. I have not used it so I cannot vouch for it but it looks promising.
One way to generate client scripts for DB changes is to use a database modeling tool like ERwin, which has a free community edition. The best way to meet your database version control requirement and get easy script generation is Redgate SQL Source Control. Using the Redgate tool you will meet the first five goals mentioned. Moreover, you can update the EF model with a single click after changing the DB schema (i.e. the Database First approach), as required in goal 6.
I do not recommend using LocalDB at all. It always causes issues with source control, such as "DB File is in use and can't commit...". In addition, the developers on the project will not have a common set of up-to-date data to work on unless one developer adds test data to the database and asks the others to get the latest version and overwrite their own database, or generates an update script with the previously mentioned tool and asks every developer to run it on their LocalDB.
The best way in your situation is to use SQL Server on the network: a master version that all the developers use. Since you have version control on the database through the previously mentioned tool, you can roll back any buggy change on the database server.
If you think the Redgate tool is too expensive for your project's budget, a second approach is to generate a single SQL file from your database that contains all database objects, and have the other developers update the SQL file in source control as they make changes. This can be done easily by using the schema compare tool in Visual Studio and appending the generated script to the SQL file in source control. With the EF Database First approach, you will not have to add many migration classes as in EF Code First.
I have quite an old application with an existing database (on MSSQL, but that does not matter). I scripted it completely, with all required static data. Now I want to introduce DB changes only via update scripts. So each function and each SP will be placed in a stand-alone file, and all schema update scripts will be stored in files named like 'SomeProduct01_0001', which means that the script belongs to product SomeProduct, sprint 1, and is the first schema update script.
I know that each script must be fully re-runnable, but I still want to be able to combine these scripts into one based on the DB version (stored in a DB table).
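To make the naming scheme concrete, here is a hedged Python sketch of how a collector might pick the scripts to combine. It assumes a two-digit sprint and a four-digit script number, inferred only from the single example name, and that the DB version is stored as a (sprint, script) pair. The parse and pending functions are hypothetical.

```python
import re

# Assumed layout: <product><2-digit sprint>_<4-digit script number>.
NAME = re.compile(r"^(?P<product>.+?)(?P<sprint>\d{2})_(?P<seq>\d{4})$")

def parse(name):
    m = NAME.match(name)
    if m is None:
        raise ValueError(f"unexpected script name: {name}")
    return m["product"], int(m["sprint"]), int(m["seq"])

def pending(names, product, db_version):
    """Return the scripts of `product` newer than db_version, in execution order.

    db_version is a (sprint, seq) tuple, as it might be read from the DB version table.
    """
    parsed = [(parse(n), n) for n in names]
    newer = [(p[1:], n) for p, n in parsed if p[0] == product and p[1:] > db_version]
    return [n for _, n in sorted(newer)]

names = ["SomeProduct02_0001", "SomeProduct01_0002", "SomeProduct01_0001", "Other01_0003"]
to_run = pending(names, "SomeProduct", (1, 1))
```

With this approach the version lives only in the file name, so no header duplicating it is needed inside each script.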
What are the common best practices for handling batches of update scripts?
Which is better: implementing version analysis in the collector (a bat or exe file), or adding some SQL header to each file? On the other hand, I already have a version, consisting of the sprint identifier and the script identifier, and I am not sure it is OK to duplicate this information in a script header.
How can I skip a file's content if the user tries to apply it to a newer database, while keeping the ability to combine this script with any other to update other, older databases?
How can I avoid database conflicts if the combined script operates on columns/tables that do not yet exist in the database but will be created by this script (for example, a table is created on line 10 and used in a trigger or constraint on line 60; as far as I know, the script will not be validated)? Maybe wrap the entire script in EXEC('')? What do I need to escape besides single quote characters?
UPD: As David Tanzer answered, it is better to use ready-made solutions for DB migrations, so that may be the best solution for cases like mine. It was not exactly an answer to my question, but it is suitable for new solutions.
You don't have to implement this yourself, there are tools that do it. Take a look at dbmaintain, it provides almost exactly the functionality you described:
http://www.dbmaintain.org/overview.html
I know of and have worked with several teams who use it to manage their database schemas in all their environments: Development, Testing, Staging and Production.
http://flywaydb.org/ seems to be another tool to do this, and it has even more features. They even have a comparison of multiple tools on their homepage (including dbmaintain)
The Background
My group has 4 SQL Server Databases:
Production
UAT
Test
Dev
I work in the Dev environment. When the time comes to promote the objects I've been working on (tables, views, functions, stored procs) I make a request of my manager, who promotes to Test. After testing, she submits a request to an Admin who promotes to UAT. After successful user testing, the same Admin promotes to Production.
The Problem
The entire process is awkward for a few reasons.
Each person must manually track their changes. If I update, add, remove any objects I need to track them so that my promotion request contains everything I've done. In theory, if I miss something testing or UAT should catch it, but this isn't certain and it's a waste of the tester's time, anyway.
Lots of changes I make are iterative and done in a GUI, which means there's no record of what changes I made, only the end result (at least as far as I know).
We're in the fairly early stages of building out a data mart, so the majority of the changes made, at least count-wise, are minor things: changing the data type for a column, altering the names of tables as we crystallize what they'll be used for, tweaking functions and stored procs, etc.
The Question
People have been doing this kind of work for decades, so I imagine there has got to be a much better way to manage the process. What I would love is to run a diff between two databases to see how their structures differ, use that diff to generate a change script, and use that change script as my promotion request. Is this possible? If not, are there any other ways to organize this process?
For the record, we're a 100% Microsoft shop, just now updating everything to SQL Server 2008, so any tools available in that package would be fair game.
I should clarify I'm not necessarily looking for diff tools. If that's the best way to sync our environments then it's fine, but if there's a better way I'm looking for that.
An example doing what I want really well are migrations in Ruby on Rails. Dead simple syntax, all changes are well documented automatically and by default, determining what migrations need to run is almost trivially easy. I'd love if there was something similar to this for SQL Server.
My ideal solution is 1) easy and 2) hard to mess up. Rails Migrations are both; everything I've done so far on SQL Server is neither.
Within our team, we handle database changes like this:
We (re-)generate a script which creates the complete database and check it into version control together with the other changes. We have 4 files: tables, user defined functions and views, stored procedures, and permissions. This is completely automated - only a double-click is needed to generate the script.
If a developer has to make changes to the database, she does so on her local db.
For every change, we create update scripts. Those are easy to create: The developer regenerates the db script of his local db. All the changes are now easy to identify thanks to version control. Most changes (new tables, new views etc) can simply be copied to the update script, other changes (adding columns for example) need to be created manually.
The update script is tested either on our common dev database, or by rolling back the local db to the last backup - which was created before starting to change the database. If it passes, it's time to commit the changes.
The update scripts follow a naming convention so everybody knows in which order to execute them.
This works fairly well for us, but still needs some coordination if several developers heavily modify the same tables and views. This doesn't happen often though.
The important points are:
database structure is only modified by scripts, except for the local developer's db. This is important.
SQL scripts are versioned by source control - the db can be created as it was at any point in the past
database backups are created regularly - at least before making changes to the db
changes to the db can be done quickly - because the scripts for those changes are created relatively easily.
However, if you have a lot of long lasting development branches for your projects, this may not work well.
It is by far not a perfect solution, and some special precautions are to be taken. For example, if there are updates which may fail depending on the data present in a database, the update should be tested on a copy of the production database.
In contrast to rails migrations, we do not create scripts to reverse the changes of an update. But this isn't always possible anyway, at least in respect to the data (the content of a dropped column is lost even if you recreate the column).
Version Control and your Database
The root of all things evil is making changes in the UI. SSMS is a DBA tool, not a developer one. Developers must use scripts to do any sort of changes to the database model/schema. Versioning your metadata and having upgrade script from every version N to version N+1 is the only way that is proven to work reliably. It is the solution SQL Server itself deploys to keep track of metadata changes (resource db changes).
Comparison tools like SQL Compare or vsdbcmd and .dbschema files from VS Database projects are just last resorts for shops that fail to take a proper versioned approach. They work in simple scenarios, but I have seen them all fail spectacularly in serious deployments. One just does not trust a tool to make a change to a 5TB+ table if the tool tries to copy the data...
RedGate sells SQL Compare, an excellent tool to generate change scripts.
Visual Studio also has editions which support database compares. This was formerly called Database Edition.
Where I work, we abolished the Dev/Test/UAT/Prod separation long ago in favor of a very quick release cycle. If we put something broken in production, we will fix it quickly. Our customers are certainly happier, but in a risk-averse corporate enterprise, it can be a hard sell.
There are several tools available for you. One is from Red-Gate called SQL Compare. Awesome and highly recommended. SQL Compare will let you do a diff in schemas between two databases and even build the sql change scripts for you.
Note they have been working on a SQL Server source control product for awhile now as well.
Another (if you're a Visual Studio shop) is the schema and data compare features that are part of Visual Studio (not sure which versions).
Agree that SQL Compare is an amazing tool.
However, we do not make any changes to the database structure or objects that are not scripted and saved in source control just like all other code. Then you know exactly what belongs in the version you are promoting because you have the scripts for that particular version.
It is a bad idea anyway to make structural changes through the GUI. If you have a lot of data, it is far slower than using ALTER TABLE, at least in SQL Server. You also want to use only tested scripts to make changes to prod.
I agree with the comments made by marapet, where each change must be scripted.
The problem that you may be experiencing, however, is creating, testing and tracking these scripts.
Have a look at the patching engine used in DBSourceTools.
http://dbsourcetools.codeplex.com
It's been specifically designed to help developers get SQL server databases under source-code control.
This tool will allow you to baseline your database at a specific point, and create a named version (v1).
Then, create a deployment target - and increment the named version to v2.
Add patch scripts to the Patches directory for any changes to schema or data.
Finally, check the database and all patches into source-code control, to distribute with devs.
What this gives you is a repeatable process to test all patches to be applied from v1 to v2.
DBSourceTools also has functionality to help you create these scripts, i.e. schema compare or script data tools.
Once you are done, simply send all of the files in the patches directory to your DBA to upgrade from v1 to v2.
Have fun.
Another "Diff" tool for databases:
http://www.xsqlsoftware.com/Product/Sql_Data_Compare.aspx
Keep database version in a versioning table
Keep script file name that was successfully applied
Keep the md5 sum of each SQL script that has been applied. The md5 calculation should ignore whitespace. It must be effective.
Keep info about who applied a script
Keep info about when a script was applied
Database should be verified on application start-up
New sql script should be applied automatically
If md5 sum of a script that was already applied is changed, error should be thrown (in a production mode)
Once a script has been released it must not be changed. It must be immutable in a production environment.
Scripts should be written so that they can be applied to different types of database (see Liquibase)
Since DDL statements are auto-committing on many databases (MySQL and Oracle, for example), it is best to have a single DDL statement per SQL script.
DDL statements should be written so that they can be executed several times without errors. This really helps in dev mode, when you may edit a script several times. For instance, create a new table only if it does not exist, or even drop the table before creating a new one. In dev mode, this lets you take a script that has not been released, change it, clear the md5 sum for this script, and rerun it.
Each SQL script should be run in its own transaction.
Triggers/procedures should be dropped and created after each DB update.
SQL scripts are kept in a versioning system like svn
The name of a script contains the date when it was committed, the existing (Jira) issue id, and a small description
Avoid adding rollback functionality to scripts (Liquibase allows you to do that). It makes them more complicated to write and support. If you use exactly one DDL statement per script, and DML statements are run within a transaction, even a failing script will not be much trouble to resolve.
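The checksum rules in this list can be sketched as follows. This is a hedged Python illustration, not part of any particular tool: script_checksum and check_script are hypothetical names, and stripping all whitespace is one simple reading of "ignore spaces" (it also ignores whitespace changes inside string literals, which may or may not be acceptable).

```python
import hashlib

def script_checksum(sql):
    """MD5 of the script with all whitespace removed, so reformatting does not change it."""
    return hashlib.md5("".join(sql.split()).encode("utf-8")).hexdigest()

def check_script(name, sql, applied):
    """Decide what to do with a script, given `applied`: a dict of name -> recorded checksum."""
    checksum = script_checksum(sql)
    if name not in applied:
        return "apply"
    if applied[name] != checksum:
        # In production mode a changed, already-applied script is an error.
        raise RuntimeError(f"{name} was modified after it was applied")
    return "skip"

applied = {"001.sql": script_checksum("CREATE TABLE t (id INT);")}
same_script = check_script("001.sql", "CREATE  TABLE t\n  (id INT);", applied)
new_script = check_script("002.sql", "ALTER TABLE t ADD COLUMN name TEXT;", applied)
```

In dev mode, "clearing the md5 sum" corresponds to deleting the script's entry from the applied records so that it is run again.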
This is the workflow we have been using successfully:
Development instance: SQL objects are created/updated/deleted in the DB using MSSQL Studio, and all operations are saved to scripts that we include in each version of our code.
Moving to production: We compare the schema between the dev and prod DBs using SQL Schema Compare in Microsoft Visual Studio. We update prod using the same tool.
Here's a more general question on how you handle database schema changes in a development team.
We are a team of developers and the databases used during development are running locally on everyone's box as we want to avoid the requirement to have web access all the time. So running a single central database instance somewhere is not a real option.
Whenever one of us decides that it is time to extend/change the db schema, we mail database files (MYI/MYD) or SQL files to execute around, or give others instructions on the phone what they need to do to get the changed code running on their local DBs. That's not the perfect approach for sure. The same problem arises when we need to adjust the DB schema on staging or production once a new release is ready.
I was wondering ... how do you guys handle this kind of stuff? For source code, we use SVN.
Really appreciate your input!
Thanks,
Michael
One approach we've used in the past is to script the entire DDL for the database, along with any test/setup data needed. Store that in SVN, then when there's a change, any developer can pull down the changes, drop the database, and rebuild it from the script files.
At the very least you should have the scripts of all the objects in the database (tables, stored procedures, etc) under source control.
I don't think mailing schema changes is a real option for a professional development team.
We had a system on one of my previous teams that was the best I've encountered for dealing with this situation.
The nightly build of the application included a build of a database (SQL Server). The database got built to the Test DB server. Each developer then had a DTS package (this was a while ago, and I'm sure they upgraded to SSIS packages) to pull down that nightly DB build to their local DB environment.
This kept the master copy in one location and put the onus on the developers to keep their local dev databases fresh.
At my work, we deal with pretty large databases that are time-consuming to generate, so for us, starting from scratch with a new DB isn't ideal. Like Harper, we have our DDL in SVN. Additionally, we store a version number in a database table. Every check-in that changes the DB must be accompanied by a script that:
Will upgrade the database schema and modify any existing data appropriately, and
Will update the version number in the database.
Further, we number the scripts and database versions such that a script we've written knows how to upgrade further along a branch or from an older branch to a newer one without any input from the developer (apart from the database name and the directory to the upgrade scripts).
Thus, if I've got a copy of a customer's 4GB DB that's from a year old version and I want to test how their data will work with the version we cut yesterday, I can just run our script and let it handle the upgrades rather than having to start from scratch and redo every INSERT, UPDATE and DELETE performed since the database was created.
We have a non-SQL description of the database schema. When the application starts, it compares the desired database schema with the actual database schema, and performs whatever CREATE TABLE, ADD COLUMN, CREATE INDEX, etc. statements it needs to get the database to look right.
This doesn't handle every case; sometimes you have to delete the database and recreate it if you've changed something that the schema resolver can't handle, but most of the time we don't need to worry about it.
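A rough sketch of such a schema resolver, in Python. The dictionary format for the schema description and the plan_changes function are assumptions made for this example, not the format this team actually uses. Only additive changes are planned, and everything else is reported so that a human can decide.

```python
def plan_changes(desired, actual):
    """Return (statements, unresolved) needed to bring `actual` up to `desired`.

    Both arguments map table name -> {column name: column type}.
    """
    statements, unresolved = [], []
    for table, columns in desired.items():
        if table not in actual:
            cols = ", ".join(f"{name} {ctype}" for name, ctype in columns.items())
            statements.append(f"CREATE TABLE {table} ({cols})")
            continue
        for name, ctype in columns.items():
            if name not in actual[table]:
                statements.append(f"ALTER TABLE {table} ADD COLUMN {name} {ctype}")
            elif actual[table][name] != ctype:
                # Type changes are the kind of case the resolver cannot handle safely.
                unresolved.append(f"{table}.{name}: {actual[table][name]} -> {ctype}")
    return statements, unresolved

desired = {
    "customer": {"id": "INTEGER", "name": "TEXT", "email": "TEXT"},
    "invoice": {"id": "INTEGER", "total": "TEXT"},
}
actual = {"customer": {"id": "INTEGER", "name": "VARCHAR(20)"}}
statements, unresolved = plan_changes(desired, actual)
```

In a real application the actual schema would be read from the database catalog at startup, and the planned statements executed before the rest of the application runs.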
I'd certainly keep the database schema in source code control.
At my present job, every time there's a schema change, we write the SQL for the change (alter table xyz add column ...) and put it in SVN. Then developers can update test databases by running this script. It's pretty clumsy but it works.
At a previous job I wrote some code that at application start-up would automatically compare the actual database schema to what it expected, and if it was not up to date perform the updates. Mostly this was done for deployment reasons: When we shipped new copies of the software, it would then automatically update the user's database. But it was also handy for developers.
I think there should be some generic SQL tool to do this. Maybe there is, but I've never seen one.