Get my database under Version Control using a DVCS [Mercurial] - sql

What would be the best approach for versioning my whole database ?
Creating a file for each database object (table,view,procedsure..) or rather having one file for all DDL scripts and any new change will be put in a separate file ?
What about handling changes made in a Database manager tool ?
I'd like to have a generic solutions for any kind of RDBMS.
Are there any other options ?

I'm a huge VCS fan in general and a big Mercurial booster, but I really think you're going down the wrong path.
VCSs aren't just about iterative changes, the "what", they're also about answering the "who", "when", and "why". For a database those answers are a lot less interesting or hard to provide to the VCS. If you're doing nightly exports and commits the "who" will always be "cron" and the "why" will always be "midnight".
The other thing modern VCSs do really well is helping you merge changes from multiple branches. That's less applicable in the database world. Very seldom do you say "I want this table structure, but this data", and if you do the text/diff merge isn't going to help you much.
The thing that does do "what" and "when" very well is an incremental backup system, and that's probably the better fit.
At work we use Tivoli and at home I use rdiff-backup and duplicity, but there are plenty of great options.
I guess my general rule of thumb is "if it was typed by hand by a human then it does into source control, and if it was generated/exported then it goes in the incremental backups"
Certainly you can make this work, but I don't think it will buy you much over the more traditional backup solutions.

Have a look at this post

If you need generic solution - put everything in the scripts (simple text files) and put under Version Control system (can be used any of VCS).
Grouping similar database objects into scripts will be depend on your requirement.
So you may for example:
Store table/indexes/ in one or several script
Each procedure store in individual script or combine small procedures into one script.
However need to remember one important thing with this approach: don't forget change scripts if you changed table/view/procedure directly in databases and don't create/recreate/compile you db objects in database after changing scripts.

SQL Source Control currently supports SVN and TFS, but Mercurial requests are increasing rapidly and we're hoping to have a story for this very soon.
We use UserVoice to measure demand so please vote accordingly if you're interesting in this: http://redgate.uservoice.com/forums/39019-sql-source-control

Related

Best way to migrate data from Access to SQL Server

The problem
Ok, sorry that my question is somewhat abstract and subjective, but will try to make it as specific as possible. So, the situation I am in is simple - I am remaking a very old MS Access application on a new website using ASP.NET MVC. As currently the MVC site is using SQL Server 2008 (for many well known reasons) I need to find a way to migrate the tables AND the data, because the information in the old database will be used in the new application.
Alright, so far so good, however there are a few problems. The old application is written in a different language, meaning that I want to translate table names, field names, and all other names that are there to English. Furthermore, I will be making some changes on the models themselves (change the type of some fields, add additional fields to some tables, remove old unnecessary ones and more). So technically I'll be 'having my way' with everything.
Researched solutions
With those things in mind I researched for the ways to migrate data from Access database to a SQL Server. Of course, there is a lot of information on the matter, in Stack Overflow alone there are more than a few questions and solutions. So why am I struggling to find the answer ? Well I found a few solutions that will be sufficient to some extend (actually will definitely solve my problems) but I am writing to ask if someone experienced has a better perspective on it than I do. Alright, the solutions and why I am still looking for advice: /I'll be listing just a couple of the most common and popular ones that I found, many of the others share the same capabilities and/or results /
Upsize Wizzard (Access) - this is a tool devised specifically for migrating tables and data from Access. It is my most favourite one for the moment as I find it kind of straightforward to work with and it provides good overall results. I was able to migrate the tables to SQL Server (along with the data of course) which more or less is what I am intending to do. It is fast, it seems like it allows you to migrate indexes, primary keys and even to my knowledge foreign keys (table relationships). The downsides of this tool, however, include that it ignores your queries (which I don't really need honestly) and it doesn't provide a way to change the model, names or types of the properties of the table you migrate - which is the thing I kind of prefer, because I will have to make more than a few changes, adding, renaming, deleting, etc. And then continue with the development process (of the application) which will lead to a few additional minor changes. And finally I would need to apply all changes (migration + all changes) on the production server, which overall is prone to mistakes as I will be doing it by hand (and there are more than a few tables).
SQL Server Migration Assistant (SSMA) - ok, this is a separate tool (not included in Access) with again the same idea - to migrate data from Access to ... possibly everywhere, haven't researched that. Overall it offers more functionality and customizing from the Upsize Wizard, but of course it does it in a more complicated way. I haven't put enough effort to make a migration with this tool yet, as it involves a lot of installations and additional work, but according to my research it provides almost all (if not all) of the functionality I require. The downside however comes with the naming. As I mentioned it allows you to apply changes on the tables, schema, fields, indexes, keys and probably everything, but the articles advice that I change the names in Access first, as it will be easier and the migration process will run more smoothly. I am not allowed to make changes on the original Access database, as it will remain functional until the publish of the 'renewed' project, and the data inside it is being used, so a mere copy of the file is a solution I am not particularly fond of, because I might loose new records. Also I cant predict the changes I would want to make in the development process (as I said I believe I would want/need to apply some additional changes later on when I find 'weaknesses' in my data design in the development process) so I find it to be a little half baked solution.
Conclusion
The options presented, the way I see them, are two:
Use the Upsize Wizard to migrate the access tables, then write a script that applies the changes I want to make. Then in the development process add any additional changes to the script. When ready to publish on the production server, reapply the migration with the wizard, run the changes script and pray everything is fine.
Get more involved with the SSMA tool and try producing an updated version of the tables with the migration process. (See how efficient the renaming is and decide whether to use copied file to rename and then find a way to migrate only new records or do it all in the SSMA). Then again write a script for the changes that occur in the development process and re-do and apply it all on the production server when ready and then pray everything is fine.
Option I have not yet seen, apply it and then pray everything is fine.
I have researched the matter for a couple of days now, and found a few more solutions that I do not believe are better by the mentioned. However I include the possibility of missing the 'big red X on the map', a practical and easy solution which seems like it was designed specifically for me (though I doubt that a little). Anyway, reducing all the madness that I have written so far to a few simple questions will look like:
Is anyone aware if my conclusions are correct? I am leaning towards option one as it is easier to accomplish.
Has anyone experienced/found a better way to do that, or just found some 'logic-leaps' in my writings as I am overthinking the entire thing a little and may be doing some obvious miscalculation.
Very sorry for asking a trivial question and one that includes decision making that may involve deeper understanding of my project and situation, yet I am working with rather sensitive data and would appreciate feedback, even if only to improve my confidence into the chosen approach.
There is one other tool/method you might want to consider that seems to cater to your specific needs more. This would be to use the data import/export tool that ships with sqlserver to do a complete copy of all data into a temporary location within sql server and then write custom queries to reorganize the names and other changes you want to make. Is a bit more work but you could use the end product as a seed method for your migrations ;) (if you are doing code first anyway)

JSON vs classic schema design [duplicate]

The Project
I've been asked to work on an interesting project -- what amounts to a basic Web CMS -- that uses HTML/CSS/jQuery with PHP. However, one requirement is that there won't be a database to house the data (they want flat files for the documents/pages -- preferable in JSON format).
In a very basic sense, it'll be used to generate HTML pages via a very "non-techie" interface. Each installation would only have around 20 pages, but a few may get up to 100. It has to be fairly easy to drop onto a PHP capable server and run, with very little setup needed.
What's Out There
There are tons of CMS options and quite a few flat file versions. But an OSS or other existing CMS is not an option. They need a simple propriety system.
Initial Thoughts
So flat files it is... but I'd really like to get some feedback on the drawbacks, and if it is worth the effort to try and convince them to use something like MySQL (SQLite or CouchDB are out since none of the servers can be configured to run them at the present time).
Of course the document files are pretty straightforward, but we're also talking about login info for 1 or 2 admins per installation, a few lists, as well as configs/settings (which also can easily be stored in a file with protection).
The Dilemma
If there are benefits to using MySQL rather than JOSN formatted files and some arrays in a simple project like this -- beyond my own pre-conceived notions :) -- I'll be sure to argue them.
But honestly I can't see any that outweigh their need to not have a database system.
I'd appreciate you insight and opinions.
If you can't cite a specific need for relational table design, then you're good with flat files. Build as specified. The moment you can cite a specific need, let them know; upgrading isn't that hard, if you're perception is timely (that is, if you aren;t in the position of having to normalize data that should have been integrated earlier).
It's a shame you can't use CouchDB, this seems like the perfect application for it. Keep in mind that using flat-files severely constrains your architecture and, especially, scalability.
What's the best case scenario for your CMS app? It's successful and people want to use it more? If you're using flat-files it'll be harder to service and improve your system (e.g. make it more robust, and add new features for future versions) and performance will not scale well. So "success" in this case is at best short-lived, as success translates into more and more work for less and less gains in feature-set and performance.
Then again, if the CSM is designed right, then switching between a flat file to RDMS should be as simple as using a different data access file.
Will this be installed on any shared hosting sites. For this to work somewhat safely, a mechanism like suEXEC needs to be set up properly as the web server will need write permissions to various directories.
What would be cool with a simple site that was feed via JSON and jQuery is that the site wouldn't need to load on each click. Just the relevant data would change. You could then use hashes in the location bar to keep track of where you were (ex. http://localhost/#about)
The problem being if they are editing the raw JSON file they can mess it up pretty quick. I think your admin tools would have to generate the JSON files based on the input so that you can ensure nothing breaks. The admin tools would be more entailed then the site (though isn't that always the case with dynamic sites)
What is the predicted data sizes for the CMS?
A large reason for the use of a RDMS is quick,specific access to large amounts of data. The data format might not be large, but if there is a lot of the data, then it might be better in the long run for a RDMS.
Then again, if the CSM is designed right, then switching between a flat file to RDMS should be as simple as using a different data access file.
While an RDBMS may be necessary for a very large CMS, a small one could run off flat files very well. A lot of CMS products out there fall down in that regard, I think, by throwing an RDBMS into the mix when there's no real need.
However, if you are using flat files, there are security issues which others have highlighted. Another issue I've come across is hosting providers using the disable_functions directive in php.ini to disable file I/O functions like fopen() and friends. If you're hosting your CMS on a box you control, you won't have this problem but if you're using a third-party provider, check first.
As the original poster, I wasn't signed in, so I'm following up to the answers so far in an answer (sorry if this is bad form).
There may instances where this is on
a shared host.
Though the JSON files can technically
be edited, this won't be the case.
The admin interface will be robust
enough to do all of the creating/editing of pages
The size for each install will be
relatively small -- 1 - 2 admins,
10-100 pages. A few lists of common
items may run longer (snippets of
copy for example).
Security will be a big issue -- any
other options suggestions on this
specifically?
Well, isn't there a problem with they being distrustful to any database system? Isn't the problem more in their thinking than in technology? Maybe they are afraid of database because it sounds complex to them. In that case, if you just present them some very simple CMS (like CMS made simple, which I've heard is really simple and the learning process is very fast), if they see everything is easy then may be they just don't care what's behind, if it's a database or whatever!
They could hear to arguments like better maintenance, lower cost of maintenance, much better handover to another webmaster than proprietary solutions (they are not dependent on you) etc.

Tool for synchronising database changes from development database to production?

This may be a pipe dream, but I'm hoping someone knows of a tool which can be configured to compare all or some (keys) of the data in two identical database and merge, perhaps based on relationships.
Specifically looking for one for SQL Server.
I'm not really asking for the best one, but if it exists it would be nice to hear how it is used.
Any other ideas for how to manage the work done or data added in dev and push it out to production without copying the entire database are welcome.
Thanks!
We use this and personally think it's excellent.
http://www.red-gate.com/products/sql-development/sql-data-compare/
There is also another product for the schema side.
http://www.red-gate.com/products/sql-development/sql-compare/
I don't know of a specific tools but you can implement in your process of publications the analysis and executions of delta files, containing the diffs from one verision to another. Magento, Wordpress are using something like this for example. They have something like this
//sql_update_001_002.sql
UPDATE some_table...
DELETE some entries
CREATE a_new_table...
// compere some keys or do other logic.
//etc
Then they have a script that analyses the current version and if needed it executes the corresponding sql.
Navicat allows to make data and structure synchronization between 2 databases (also located on different servers).
In terms of tools I agree with Chris - Redgate's toolset for both schema and data comparisons
If you are also thinking about your overall db dev process - then I have written a blog on the topic which might be of interest.
It also has some links to how others have tackled this subject.
http://michaelbaylon.wordpress.com/category/data-management/database-development/sql-script-management/

Scripting your database first versus building the database via SQL Server Management Studio and then generating the script

I had a (friendly but heated) argument with my lead developer the other day because our project has TSQL Scripts that I code directly into SQL files which I then run against the database. I find that when I do this, it's easy to work out the schema in advance without fiddly pointing and clicking and then there's no opportunity to forget to generate a script to put into source control as generating the script no longer becomes a chore you have to do after the fact, but is an implicit part of the process (and also leads to cleaner scripts without the extra crap that SQL Server Management Studio inserts into the scripts it generates).
My lead developer insists that having to manually script it out is a pain in the arse and that he absolutely refuses to write his scripts by hand when there are perfectly good tools to do it without coding. I've noticed that the copying of his changes into the actual scripts tends to get delayed a bit as a result though.
What are your thoughts on the pros and/or cons of doing it one way vs the other? Am I being too rigid/old-school in my sticking to hand coding schema scripts or is he being too reliant on third party tools and losing something in the process?
I always script stuff myself because the wizards sometimes don't script things in a way that I like it and will also give funky names to defaults
scripting things yourself is also good in case you get laid off and you have to go for an interview where they ask you to script DDL on the whiteboard
As I usually collaborate with a colleague during the schema design, I tend to design the schema using the GUI tools, as its easier to discuss it with a diagram of the tables in front of you. I then generate the scripts, being careful to select the exact options that I want to avoid having to make manual changes post-export.
I think a decision on the relative merits of the two approaches might take into account factors such as
the frequency of changes to the schema
the frequency with which changes need to be propagated to other schemas (test, user acceptance, production, clients * n, etc)
the degree to which the schema may vary across development branches
how well-known in advance your various changes can be scheduled
whether or not you can generate SQL "diff" scripts between schemas.
On balance, I tend to prefer to work with a script for each change (or "migration"). It lets me resequence change releases as priorities shift.
Just because you can create tables in a graphical tool doesn't necessarily mean you should.
I find its as quick to write a script as it is to use SQLMS. You still have to type names in SQLMS, and the time spent moving from keyboard and mouse could be used writing the proper script anyway.
The two of you are almost working with two sets of code. Consistency seems to be a key factor on these types of decisions. In your case, if you create a script, your boss uses the gui to add a field, how do you stay in sync? You can't use your script to rebuild the table without editing it (Chance for error.).
Maybe he should pull rank and force you to format your scripts the same way the GUI creates them - just kidding.
I think you should flip on it..........

Why use database migrations instead of a version controlled schema

Migrations are undoubtedly better than just firing up phpMyAdmin and changing the schema willy-nilly (as I did during my php days), but after using them for awhile, I think they're fatally flawed.
Version control is a solved problem. The main function of migrations is to keep a history of changes to your database. But storing a different file for each change is a clumsy way to track them. You don't create a new version of post.rb (or a file representing the delta) when you want to add a new virtual attribute -- why should you create a new migration when you want to add a new non-virtual attribute?
Put another way, just as you check post.rb into version control, why not check schema.rb into version control and make the changes to the file directly?
This is functionally the same as keeping a file for each delta, but it's much easier to work with. My mental model is "I want table X to have such and such columns (or really, I want model X to have such and such properties)" -- why should you have to infer from this how to get there from the existing schema; just open up schema.rb and give table X the right columns!
But even the idea that classes wrap tables is an implementation detail! Why can't I just open up post.rb and say:
Class Post
t.string :title
t.text :body
end
If you went with a model like this, you'd have to make a decision about what to do with existing data. But even then, migrations are overkill -- when you migrate data, you're going to lose fidelity when you use a migration's down method.
Anyway, my question is, even if you can't think of a better way, aren't migrations kind of gross?
why not check schema.rb into version control and make the changes to the file directly?
Because the database itself is not in sync with version control.
For instance, you could be using the head of the source tree. But you're connecting to a database that was defined as some past version, not the version you have checked out. The migrations allow you to upgrade or downgrade the database schema from any version and to any version, incrementally.
But to answer your last question, yes, migrations are kind of gross. They implement a redundant revision control system on top of another revision control system. However, neither of these revision control systems is really in sync with the database.
Just to paraphrase what others have said: migrations allow you to protect the data as your schema evolves. The notion of maintaining a single schema.rb file is attractive only until your app goes into production. Thereafter, you'll need a way to migrate your existing users' data as your schema changes.
There are also data-related issues that are important to consider, which migrations solve.
Say an old version of my schema has a feet and inches column. For efficiency purposes, I want to combine that into just an inches column to make sorting and searching easier.
My migration can combine all of the feet and inches data into the inches column (feet * 12 + inches) while it's updating the database (i.e. just before it removes the feet column)
Obviously this being in a migration makes it automatically work when you later apply the changes to your production database.
As it stands, they're annoying and inadequate but quite possibly the best option we have available to us at present. Quite a few smart people have spent quite a lot of time working on the problem and this, so far, is about the best they've been able to come up with. After about 20 years of mostly hand-coding database version updates, I came very rapidly to appreciate migrations as a major improvement when I found ActiveRecord.
As you say, version control is a solved problem. Up to a point I'd agree: it's very solved for text files in particular, less so for other file types and not really very much at all for resources such as databases.
How do migrations look if you view them as version control deltas for databases? They're the sum of the deltas you have to apply to get a schema from one version to another. I'm not aware that even git, for all its super-powerfulness, can take two schema files and generate the necessary DDL to do that.
As far as declaring table content in the model, I believe that's what DataMapper does (no personal experience). I think there may be some DDL inference capabilities there as well.
"even if you can't think of a better way, aren't migrations kind of gross?"
Yes. But they're less gross than anything else we have. Do please let us know when you've completed the non-gross alternative.
I suppose given "even if you can't think of a better way", then yes, in the grand scheme of things, migrations are kind of gross. So are Ruby, Rails, ORMs, SQL, web apps, ...
Migrations have the (not insignificant) advantage that they exist. Gross-but-exists tends to win out over Pleasant-but-nonexistent. I'm sure there probably are pleasant and nonexistent ways to migrate your data, but I'm not sure what that means. :-)
OK, I'm going to take a wild guess here and say that you're probably working all by yourself. In a group development project the power of each individual to take responsibility for just his/her changes to the database required for the code that developer is writing is much much more important.
The alternative is that larger groups of programmers (e.g. 10-15 Java developers where I work) end up relying on a couple of dedicated full time database administrators to do that along with their other maintenance, optimization, etc. duties.