How to rollback a database deployment without losing new data? - sql

My company uses virtual machines for our web/app servers. This allows for very easy rollbacks of a deployment if something goes wrong. However, if an app server deployment also requires a database deployment and we have to roll back, I'm at a loss. How can you roll back database schema changes without losing data? The only thing I can think of is to write a script that drops/reverts tables and columns back to their original state. Is this really the best way?

But if you do drop columns, you will lose data, since those columns/tables will (presumably) contain some data. And since rollbacks are often temporary, in that a bug is found, a rollback is made to keep things going while the bug is fixed, and then more or less the same changes are re-installed, users could get quite upset if you lost that data and they had to re-enter it when the system was fixed.
I'd suggest that you only allow additions of tables and columns, no alterations or deletions. Then you can roll back just the code and leave the data as is. If you have a lot of rollbacks you might end up with some unused columns, but it shouldn't happen often that someone adds a table or column by mistake, and in that case the DBA can remove them manually.

Generally speaking, you cannot do this.
However, assuming that such a rollback makes sense, it implies that the data you are trying to retain is independent of the schema changes you'd like to revert.
One way to deal with it would be to:
back up only the data (as a script),
revert the schema to the old one, and
restore the data.
The above works well as long as the schema changes do not invalidate the generated data script (for example, changing the number of columns would be tricky).
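A minimal T-SQL sketch of those three steps, assuming a hypothetical Orders table whose rollback recreates the old table shape (all names are invented for illustration):
-- 1. Script out the data you need to keep
SELECT OrderId, CustomerId, Amount
INTO Orders_keep
FROM Orders;
-- 2. Revert the schema: here the rollback recreates the old version of the table
DROP TABLE Orders;
CREATE TABLE Orders (OrderId INT PRIMARY KEY, CustomerId INT NOT NULL);
-- 3. Restore only the columns that still exist in the old schema
INSERT INTO Orders (OrderId, CustomerId)
SELECT OrderId, CustomerId FROM Orders_keep;
DROP TABLE Orders_keep;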
This question has details on tools available in MS SQL for generating scripts.

Related

How to continuously deliver a SQL-based app?

I'm looking to apply continuous delivery concepts to a web app we are building, and wondering if there is any solution for protecting the database from an accidental erroneous commit. For example, a bug that erases a whole table instead of a single record.
How can the impact of this issue be limited according to continuous delivery doctrine, where the application is deployed gradually over segments of infrastructure?
Any ideas?
Well, first, you cannot tell just from looking what a bad SQL statement is. You might have wanted to delete the entire contents of the table. Therefore it is not physically possible to have an automated tool that detects intent.
So to protect your database, first make sure you are in full recovery (not simple) mode and have full backups nightly and transaction log backups every 15 minutes or so. Now you cannot lose much information no matter how badly the process breaks. Your DBAs should be trained to be able to recover to a point in time. If you don't have any DBAs, I'd suggest the best thing you can do to protect your data is hire some. This is non-negotiable in any non-trivial database environment, and it is terribly risky not to have trained, experienced DBAs if your data is critical to the business.
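A rough T-SQL sketch of that setup (database name and paths are placeholders; the log backup would normally run from a scheduled job every 15 minutes):
-- Make sure the database is in full recovery mode
ALTER DATABASE MyAppDb SET RECOVERY FULL;
-- Nightly full backup
BACKUP DATABASE MyAppDb TO DISK = 'D:\Backups\MyAppDb_full.bak';
-- Transaction log backup, scheduled every 15 minutes or so
BACKUP LOG MyAppDb TO DISK = 'D:\Backups\MyAppDb_log.trn';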
Next, you need to treat SQL like any other code: it should be in source control, in scripts. If you are terribly concerned about accidental deletions, then write the delete scripts to copy all deleted rows to a staging table, and purge the staging table once a week or so. Enforce this convention in code reviews. Or, better yet, set up an auditing process that runs through triggers. Once all records are audited, it is much easier to get back the 150 accidental deletions without having to restore a database. I would never consider having any enterprise application without auditing.
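For example, the convention for a hand-written delete script might look like this (table and column names are invented for illustration):
-- Copy the rows you are about to remove into a staging table first
INSERT INTO Customers_delete_staging (CustomerId, Name, DeletedAt)
SELECT CustomerId, Name, GETDATE()
FROM Customers
WHERE LastOrderDate < '2010-01-01';
-- Then run the actual delete with the same WHERE clause
DELETE FROM Customers
WHERE LastOrderDate < '2010-01-01';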
All SQL scripts, without exception, should be code-reviewed just like other code. All SQL scripts should be tested on QA and passed before moving to production. This will greatly reduce the possibility of error. No developer should have write rights to production; only DBAs should have that. Therefore each script should be written so that it can just be run, not run one chunk at a time where you could accidentally forget to highlight the WHERE clause. Train your developers to use transactions correctly in the scripts as well.
Your concern is something bad happening to the data in the database. The solution is to use full logging of all transactions so you can back out of the transactions you want to. This would usually be used in the context of full backups/incremental backups/full logging.
SQL Server, for instance, allows you to restore to a point in time (http://msdn.microsoft.com/en-us/library/ms190982(v=sql.105).aspx), assuming you have full logging.
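A sketch of such a point-in-time restore (names, paths, and the STOPAT time are placeholders):
-- Restore the last full backup without recovering yet
RESTORE DATABASE MyAppDb FROM DISK = 'D:\Backups\MyAppDb_full.bak' WITH NORECOVERY;
-- Roll the log forward, stopping just before the bad statement ran
RESTORE LOG MyAppDb FROM DISK = 'D:\Backups\MyAppDb_log.trn'
WITH STOPAT = '2012-06-01 14:32:00', RECOVERY;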
If you are creating and dropping tables, this could be an expensive solution, in terms of space needed for the log. However, it might meet your needs for development.
You may find that full-logging is too expensive for such an application. In that case, you might want to make periodic backups (daily? hourly?) and just keep these around. For this purpose, I've found LightSpeed to be a good product for fast and efficient backups.
One strategy that is commonly adopted is to log incremental SQL statements rather than a collective schema generation, so you can control the change at a much more granular level:
For example:
change 1: UP: add column / DOWN: remove column
change 2: UP: add trigger / DOWN: remove trigger
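For instance, change 1 and change 2 might be captured as paired scripts like these (table, column, and trigger names are invented for illustration):
-- change_001_up.sql
ALTER TABLE Orders ADD ShippedDate DATETIME NULL;
-- change_001_down.sql
ALTER TABLE Orders DROP COLUMN ShippedDate;
-- change_002_up.sql (assumes an Orders_audit table already exists)
CREATE TRIGGER trg_Orders_ShippedDate ON Orders AFTER UPDATE AS
INSERT INTO Orders_audit (OrderId, ChangedAt)
SELECT OrderId, GETDATE() FROM inserted;
-- change_002_down.sql
DROP TRIGGER trg_Orders_ShippedDate;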
Once the changes are captured incrementally like this, you can have a simple but efficient script to upgrade (UP) from any version to any version without having to worry about the individual changes. When the change numbers are linked to builds, it becomes even more effective: when you deploy a build, the database is also automatically upgraded (UP) or downgraded (DOWN) to that specific build.
We have a pipeline app that does this at CloudMunch.

Database Modification Scripts - Rollout / Rollback Best Practice?

I'm looking for insight into best practices regarding database scripting for modifications that go out along side other code changes for a software system.
I used to work for a company that insisted that every rollout has a rollback ready in case of issues. This sounds sensible, but in my opinion the rollback code for database modifications deployed via scripts has the same likelihood of failing as the rollout script.
For managed code, version control makes this quite simple, but for a database schema, rolling back changes is not so easy - especially if data is changed as part of the roll out.
My current practice is to test the roll out code by running against a test database during late stage development, and then run the application against that test database. Following this, I back up the live DB, and proceed with the roll out.
I'm yet to run into a problem, but am wondering how other shops manage database changes, and what the strategy is for recovering from any bugs.
All of our database scripts go through several test phases against databases that are like our live database. This way we can be fairly certain that the modification scripts will work as expected.
For rolling back, stored procedures, views, functions, triggers, everything programmatic is easy to roll back, just apply the previous version of the object.
Like you mentioned, the challenging part comes when updating / deleting records from tables, or even adding new columns to tables. And you're right that in this case the rollback can be just as likely to fail.
What we do, if we have a change that can't be easily rolled back but is in a sensitive/critical section, is have a set of rollback scripts that also go through the same testing environments. We run the update script, validate that it works as expected, and then run the rollback script and validate that the system works as it did prior to the modification.
Another thing that we do as just a precaution is to create a database snapshot (SQL Server 2005) prior to an update. That way if there are any unexpected issues, we can use the snapshot to recover any data that was potentially lost during the update.
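Creating such a snapshot before an update looks roughly like this in SQL Server (database name, logical file name, and path are placeholders):
CREATE DATABASE MyAppDb_pre_update ON
(NAME = MyAppDb_Data, FILENAME = 'D:\Snapshots\MyAppDb_pre_update.ss')
AS SNAPSHOT OF MyAppDb;
-- If needed, lost rows can be copied back from the snapshot, or the whole database reverted:
-- RESTORE DATABASE MyAppDb FROM DATABASE_SNAPSHOT = 'MyAppDb_pre_update';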
So the safest course of action is to test against databases that are as close to your live system as possible, and to test your rollback scripts as well... and just in case both of those fail, have a snapshot ready just in case you need it.
SQL Diff (or something like it) is always helpful if you are using a test database. It has a lot of checks and balances, safeguards, and ways of restoring or rolling back if there is an issue. Very useful.
http://www.apexsql.com/sql_tools_diff.aspx

Recommended approach how to modify schema of a production SQL database?

Say there is a database with 100+ tables and a major feature is added, which requires 20 of the existing tables to be modified and 30 more added. The changes were done over a long time (6 months) by multiple developers on the development database. Let's assume the changes do not make any existing production data invalid (e.g. there are default values/nulls allowed on added columns, and there are no new relations or constraints that could not be fulfilled).
What is the easiest way to publish these changes in schema to the production database? Preferably, without shutting the database down for an extended amount of time.
Write a T-SQL script that performs the needed changes. Test it on a copy of your production database (restore from a recent backup to get the copy). Fix the inevitable mistakes that the test will discover. Repeat until script works perfectly.
Then, when it's time for the actual migration: lock the DB so only admins can log in. Take a backup. Run the script. Verify results. Put DB back online.
The longest part will be the backup, but you'd be crazy not to do it. You should know how long backups take; the overall process won't take much longer than that, so that's how long your downtime will need to be. The middle of the night works well for most businesses.
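In T-SQL the migration window might look roughly like this (names and paths are placeholders):
-- Lock out everyone except admins
ALTER DATABASE MyAppDb SET RESTRICTED_USER WITH ROLLBACK IMMEDIATE;
-- Take the safety backup
BACKUP DATABASE MyAppDb TO DISK = 'D:\Backups\MyAppDb_pre_migration.bak';
-- Run the tested migration script here and verify the results...
-- ...then put the database back online for everyone
ALTER DATABASE MyAppDb SET MULTI_USER;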
There is no generic answer on how to make 'changes' without downtime. The answer really depends from case to case, based on exactly what the changes are. Some changes have no impact on downtime (e.g. adding new tables), some changes have minimal impact (e.g. adding columns to existing tables with no data size change, like a new nullable column that does not increase the null bitmap size), and other changes will wreak havoc on downtime (any operation that changes data size will force an index rebuild and lock the table for the duration). Some changes are impossible to apply without significant downtime. I know of cases where the changes were applied in parallel: a copy of the database is created, replication is set up to keep it current, then the copy is changed and kept in sync, and finally operations are moved to the modified copy, which becomes the master database. There is a presentation from PASS 2009 given by Michelle Ufford that mentions how GoDaddy went through such a change that lasted weeks.
But, at a lesser scale, you must apply the changes through a well-tested script, and measure the impact during test evaluation.
But the real question is: is this going to be the last change you ever make to the schema? Have you finally discovered the perfect schema for the application, so that the production database will never change again? Congratulations, once you pull this off you can rest. But realistically, you will face the very same problem in 6 months. The real problem is your development process, with developers making changes from SSMS or from the VS Server Explorer straight into the database. Your development process must make a conscious effort to adopt a schema change strategy based on versioning and T-SQL scripts, like the one described in Version Control and your Database.
Use a tool to create a diff script and run it during a maintenance window. I use RedGate SQL Compare for this and have been very happy with it.
I've been using dbdeploy successfully for a couple of years now. It allows your devs to create small SQL change deltas which can then be applied against your database. The changes are tracked by a changelog table within the database so that it knows what to apply.
http://dbdeploy.com/
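Conceptually, the tracking table is something like this (a generic sketch, not dbdeploy's actual schema):
CREATE TABLE changelog (
    change_number INT NOT NULL PRIMARY KEY,
    applied_at DATETIME NOT NULL,
    description VARCHAR(200) NOT NULL
);
-- Each applied delta records itself, so the tool knows which changes to skip next time
INSERT INTO changelog (change_number, applied_at, description)
VALUES (1, GETDATE(), 'add ShippedDate column to Orders');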

Access sql that triggered the trigger from within trigger (Sybase)

Is there a way to access the sql that triggered a trigger from within the trigger? I've managed to get it by joining to the master..monProcessSQLText MDA table but this only works for users with the mon_role and I don't want to give that to everyone. Is there a global variable I've missed?
I'm trying to log all the updates run against a table so I can trace it back to an IP address and username.
This is with ASE 12.5.
If you are trying to
log all the updates run against a table so I can trace it back to an IP address and username
A trigger is definitely the wrong way to go about it; triggers were not designed for that, and there are other ASE facilities that were. It is not about the table, it is about security and monitoring in general.
Sybase Auditing.
It takes a bit of setting up, but has much less overhead than MDA tables; most important, it was designed for auditing (MDA was not). And there are no coding requirements such as there are for MDA. It is highly configurable; the idea is to capture only what you need, and no more.
Monitoring.
I would not recommend MDA tables, but since you have them in place, and you have enabled monitoring and accepted the 22% overhead for capturing SQL text... The info is very transient. In order to use it for any relevant purpose, such as yours, you need to write a capture-and-store mechanism, archiving all required info to an archive database. This has to be done on an ongoing basis, completely independent of a trigger, etc. You can also filter on the fly to reduce the volume of data stored (warning, it is huge), purge data over 7 days old, etc. It is a little project in itself, which is why such tools are commercially available from 3rd parties.
Once either of these facilities is in place, then, separately, whenever you wish to inquire about who updated a table, when, and from where, all you need to do is inspect the archive. Nothing to do with a trigger, or difficulties getting the info from a trigger, or giving admin privileges to ordinary users.
Also, it needs to be appreciated that you do not have normal security in place: the tables are being updated directly by users, so direct update permissions have been granted either to specific users or, worse, to all users. The consequence is that there is no way of knowing who is updating the table, and who is breaking the data or referential integrity.
The secure method is to place the entire transaction in a stored proc, thus eliminating the possibility of incomplete transactions (as well as improving execution speed); and to grant permissions on the procs, not the tables, thus eliminating direct updates. Over time, you may wish to implement security in the server, so that the consequences do not have to be chased down and closed one by one, a process with no finite end.
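In outline, that permission model is (object and group names are illustrative; the syntax is essentially the same in ASE and MS SQL):
-- No direct table access for ordinary users
REVOKE INSERT, UPDATE, DELETE ON Orders FROM app_users;
-- The whole transaction lives in a stored proc (body omitted here)...
-- CREATE PROCEDURE UpdateOrderStatus @OrderId int, @Status int AS ...
-- ...and users may only execute that proc
GRANT EXECUTE ON UpdateOrderStatus TO app_users;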
As far as Auditing goes, if security were in place, then the auditing burden is also substantially reduced: you need to audit stored proc executions only. Otherwise, you need to audit all updates to all tables.

Database safety: Intermediary "to_be_deleted" column/table?

Everyone has accidentally forgotten the WHERE clause on a DELETE query and blasted some un-backed up data once or twice. I was pondering that problem, and I was wondering if the solution I came up with is practical.
What if, in place of actual DELETE queries, the application and maintenance scripts did something like:
UPDATE foo SET to_be_deleted=1 WHERE blah = 50;
And then a cron job was set to go through and actually delete everything with the flag? The downside would be that pretty much every other query would need to have WHERE to_be_deleted != 1 appended to it, but the upside would be that you'd never mistakenly lose data again. You could see "2,349,325 rows affected" and say, "Hmm, looks like I forgot the WHERE clause," and reset the flags. You could even make the to_be_deleted field a DATE column, so the cron job would check to see if a row's time had come yet.
Also, you could remove DELETE permission from the production database user, so even if someone managed to inject some SQL into your site, they wouldn't be able to remove anything.
So, my question is: Is this a good idea, or are there pitfalls I'm not seeing?
That is fine if you want to do that, but it seems like a lot of work. How many people are manually changing the database? It should be very few, especially if your users have an app to work with.
When I work on the production DB, I put EVERYTHING I do in a transaction so if I mess up I can roll back. Just having a standard practice like that has helped me.
I don't see anything really wrong with it, though, other than that every single point of data manipulation in each application will have to be aware of this functionality, not just the data it wants.
This would be fine as long as your application does not require that the data is immediately deleted, since you have to wait for the next interval of the cron job.
I think a better solution and the more common practice is to use a development server and a production server. If your development database gets blown out, simply reload it. No harm done. If you're testing code on your production database, you deserve anything bad that happens.
A lot of people have a delete flag or a row status flag. But if someone is doing a change through the back end (and they will, since often people need batch changes done that can't be accomplished through the front end) and they make a mistake, they will still often go for a real DELETE. Ultimately this is no substitute for testing the script before applying it to a production environment.
Also... what happens if the query "UPDATE foo SET to_be_deleted=1" gets executed because they left off the WHERE clause? Unless you have auditing columns with a timestamp, how do you know which rows were legitimately flagged for deletion and which ones were flagged in error? But even if you have auditing columns with a timestamp, if the auditing is done via a stored procedure or programmer convention, these back-end queries may not supply information letting you know that they were just applied.
Too complicated. The standard approach to this is to do all your work inside a transaction, so if you screw up and forget a WHERE clause, then you simply roll back when you see the "2,349,325 rows affected" result.
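For example, reusing the names from the question:
BEGIN TRAN;
DELETE FROM foo WHERE blah = 50;
SELECT @@ROWCOUNT AS rows_deleted;  -- sanity-check the count before committing
-- If the number looks wrong:
-- ROLLBACK TRAN;
-- Otherwise:
-- COMMIT TRAN;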
It may be easier to create a parallel table for deleted rows. A DELETE trigger (and UPDATE too if you want to undo changes as well) on the original table could copy the affected rows to the parallel table. Adding a datetime column to the parallel table to record the date & time of the change would let you permanently remove rows past a certain age using your cron job.
That way, you'd use normal DELETE statements on the original table, so there's no chance you'll forget to run your special "DELETE" statement. You also sidestep the to_be_deleted != 1 expression, which is just a bug waiting to happen when someone inevitably forgets.
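A minimal T-SQL sketch of such a trigger, reusing the foo table from the question (the id column and the foo_deleted parallel table are assumptions for illustration):
CREATE TRIGGER trg_foo_delete ON foo AFTER DELETE AS
-- Copy every deleted row into the parallel table, stamped with the deletion time
INSERT INTO foo_deleted (id, blah, deleted_at)
SELECT id, blah, GETDATE()
FROM deleted;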
It looks like you're describing three cases here.
Case 1 - maintenance scripts. Risk can be minimized by developing them and testing them in an environment other than your production box. For quick maintenance, do the maintenance in a single transaction, and check everything before committing. If you made a mistake, issue the rollback command. For more serious maintenance that you can't necessarily wait around for, or do in a single transaction, consider taking a backup directly before running the maintenance job, so that you can always restore back to the point before you ran your script if you encounter serious problems.
Case 2 - SQL Injection. This is an architecture issue. Your application shouldn't pass raw SQL into the database; access should be controlled through packages / stored procedures / functions, and values that come from the UI and are used in a DML statement should be applied using bind variables, rather than by creating dynamic SQL by appending strings together.
Case 3 - Regular batch jobs. These should have been tested before being deployed to production. If you delete too much, you have a bug, and are going to have to rely on your backup strategy.
Everyone has accidentally forgotten the WHERE clause on a DELETE query and blasted some un-backed up data once or twice.
No. I always prototype my DELETEs as SELECTs, and only if the latter returns exactly the rows I want to delete do I change the statement before the WHERE into a DELETE. This lets me inspect, in any needed detail, the rows I want to affect before doing anything.
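Using the statement from the question as an example:
-- Prototype the delete as a SELECT and inspect the rows it returns
SELECT * FROM foo WHERE blah = 50;
-- Only once those are exactly the rows to remove, change the part before WHERE
DELETE FROM foo WHERE blah = 50;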
You could set up a view on that table that selects WHERE to_be_deleted != 1, and all of your normal selects are done on that view - that avoids having to put the WHERE on all of your queries.
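Something like the following, reusing the names from the question (and assuming to_be_deleted defaults to 0 rather than NULL):
CREATE VIEW foo_active AS
SELECT * FROM foo WHERE to_be_deleted != 1;
-- Normal reads then go through the view:
-- SELECT * FROM foo_active WHERE blah = 50;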
The pitfall is that it's unnecessarily complicated, and someone will inadvertently forget to check the flag in their query. There's also the issue of potentially needing to delete something immediately instead of waiting for the scheduled job to run.
To avoid the to_be_deleted WHERE clause you could create a trigger that fires before the delete command to insert the deleted rows into a separate table. This table could be cleared out when you're sure everything in it really needs to be deleted, or you could keep it around for archive purposes.
You also get a "soft delete" feature, so you can give (certain) end users the power of "undo" - there would have to be a pretty strong downside in the mix to cancel out the benefits of soft deleting.
The "WHERE to_be_deleted <> 1" on every other query is a huge one. Another is: once you've run your accidental rogue query, how will you determine which of the 2,349,325 rows were previously marked as deleted?
I think the practical solution is regular backups, and failing that, perhaps a delete trigger that captures the tuples to be axed.
The other option would be to create a delete trigger on each table. When anything is deleted, it would insert that "to be deleted" record into another table, ideally named TABLENAME_deleted.
The downside would be that the db would have twice as many tables.
I don't recommend triggers in general, but it might be what you are looking for.
This is why, whenever you are editing data by hand, you should BEGIN TRAN, edit your data, check that it looks good (for instance, that you didn't delete more data than you were expecting), and then COMMIT. If you're using Postgres then you want to create lots of savepoints as well, so that a typo doesn't wipe out your intermediate work.
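A short Postgres-flavored sketch of that habit, reusing the table from the question (values are illustrative):
BEGIN;
UPDATE foo SET blah = 51 WHERE blah = 50;
SAVEPOINT after_update;              -- protect the work done so far
DELETE FROM foo WHERE blah = 999;    -- suppose this statement was a mistake
ROLLBACK TO SAVEPOINT after_update;  -- only the mistake is undone
COMMIT;                              -- the earlier UPDATE is kept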
But that said, in many applications it does make sense to have software mark records as invalid rather than deleting them. Add a last_modified date that is automatically updated, and you are all prepared to set up incremental updates into a data warehouse. Even if you don't have a data warehouse now, it never hurts to prepare for the future when preparing is cheap. Plus in the event of manual mistakes you still have the data, and can just find all of the records that got "deleted" when you made your mistake and fix them. (You should still use transactions though.)