How is liquibase rollback supposed to work?

I am just getting started with liquibase and it seems quite useful. My biggest issue is with rollback.
I'm baking my liquibase changelog into the jar that has my data layer in it and on app startup, I'm migrating automatically using the changelog in the jar in the app. If I'm only moving forward, this works fine.
But if I have two branches each working on that data layer jar and I want to switch back and forth between them using the same DB, it doesn't work because the changelog in one branch has different changesets than the other. In and of itself, that isn't a problem, but when I swap branches and start my app, it doesn't know how to rollback the changesets from the other branch because they are not in the changelog yet.
Is the answer here to just be careful? Always use separate DBs?
Why not put rollback into the DATABASECHANGELOG table in the DB so unknown changesets can be rolled back without the changelog file?

You are right that rollback simply looks at the applied changes in the DATABASECHANGELOG table and rolls changeSets back based on what is in the changelog. It could store the rollback info in the DATABASECHANGELOG table, but it doesn't for a variety of reasons, including simplicity, space, and security. There are also times it can be nice to roll back changes based on updated changeSet rollback info rather than what was set when the changeSet was first executed.
In your case, rollback is more complex because you are looking to switch branches frequently. What I've generally found is that feature branches tend to make relatively independent changes, so even if you switch between branches you can leave the database changes in place because they created new tables or columns which the other code simply ignores. There are definitely times this is not true, but within the development environment you find the problems right away and can address them as needed. When you do find you need to roll back changes from another branch, you can remember to roll them back before switching branches. Some groups don't bother with rollback at all and just rebuild their development database when needed (Liquibase "contexts" are very helpful for managing test/dev data).
As you move from development to QA and production you normally don't have to deal with the same level of branch churn, so there is usually no difference between the changeSets you are looking to roll back and what is in the changelog.
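One way to make the branch switching less painful is to give every changeSet an explicit rollback, so the changes can be rolled back from the command line before you switch branches. A minimal sketch using a Liquibase SQL-formatted changelog (the author, id, and table names here are made up):

    --liquibase formatted sql

    --changeset alice:add-widget-table
    create table widget (
        id int primary key,
        name varchar(255)
    );
    --rollback drop table widget;

With explicit rollbacks in place, something like "liquibase rollbackCount 1", or tagging the database ("liquibase tag before-feature-x") and later running "liquibase rollback before-feature-x", can undo a branch's changeSets while that branch's changelog is still checked out.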

Related

Inserted data is not shown in Oracle db using a direct query

I was not able to find a solution for this question online, so I hope I can find help here.
I've inherited a Java web application that performs changes to an Oracle database and displays data from it. The application uses a 'pamsdb' user ID. The application inserted a new row in one of the tables (TTECHNOLOGY). When the application later queries the db, the result set includes the new row (I can see it in print outs and in the application screens).
However, when I query the database directly using sqldeveloper (using the same user id 'pamsdb'), I do not see the new row in the modified table.
A couple of notes:
1) I read here and in other locations that all INSERT operations should be followed by a COMMIT, otherwise other users cannot see the changes. The Java application does not do COMMIT, which I thought could be the source of the problem, but since I'm using the same user ID in sqldeveloper, I'm surprised I can't see the changes there.
2) I tried doing COMMIT WORK from sqldeveloper, but it didn't change my situation.
Can anyone suggest what's causing the discrepancy and how can it be resolved?
Thanks in advance!
You're using the same user, but in a different session. One session can't see uncommitted changes made in another session, for any user - they are independent.
You have to commit from the session that did the insert - i.e. your Java code has to commit for its changes to be visible anywhere else. You can't make the Java session's changes commit from elsewhere, and committing from SQL Developer - even as the same user - only commits any changes made in that session.
You can read more about connections and sessions, and transactions, and the commit documentation summarises as:
Use the COMMIT statement to end your current transaction and make permanent all changes performed in the transaction. A transaction is a sequence of SQL statements that Oracle Database treats as a single unit. This statement also erases all savepoints in the transaction and releases transaction locks.
Until you commit a transaction:
You can see any changes you have made during the transaction by querying the modified tables, but other users cannot see the changes. After you commit the transaction, the changes are visible to other users' statements that execute after the commit.
You can roll back (undo) any changes made during the transaction with the ROLLBACK statement (see ROLLBACK).
The "other users cannot see the changes" really means other user sessions.
If the changes are being committed and are visible from a new session via your Java code (after the web application and/or its connection pool have been restarted), but are still not visible from SQL Developer; or changes made directly in SQL Developer (and committed there) are not visible to the Java session - then the changes are being made either in different databases, or in different schemas for the same database, or (less likely) are being hidden by VPD. That should be obvious from the connection settings being used by the two sessions.
From comments it seems that was the issue here, with the Java web application and SQL Developer accessing different schemas which both had the same tables.
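If you ever need to rule that out again, one quick check is to run the same query from both connections and compare the output (these USERENV attributes are standard Oracle; the aliases are just for readability):

    SELECT sys_context('USERENV', 'DB_NAME')        AS db_name,
           sys_context('USERENV', 'SESSION_USER')   AS sess_user,
           sys_context('USERENV', 'CURRENT_SCHEMA') AS curr_schema
    FROM dual;

Logging this from the Java side at startup makes a mismatch in database or schema obvious straight away.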

Best practices to prevent unneeded liquibase changesets in production

So here is the scenario:
release is pushed to production every 2 sprints
dev is writing changesets during a sprint that add table A + 2 columns in table B, commits them, it gets pushed to QA, everything good.
during sprint demo, stakeholder identifies that he needs 2 more fields in table B, and that a whole section needs to be reworked, leading to dropping table A, removing 3 columns in table B and adding 2 new ones in table B
next sprint, developer implements the changes identified in the sprint demo (i.e. adding changesets to drop the table that was previously added, etc.)
=> so now we have changesets that, when they are deployed to production, are basically going to create a table and then drop it right away. For minimal changes like the above this is not really a problem, but for bigger changesets you could end up with the production database transaction log growing unnecessarily. The time to upgrade production might also increase quite a bit because of changesets that revert work that did not need to be done in the first place.
Would it be recommended to rework the changesets so that for a release only the required changesets are executed (basically reworking changesets of the 2 sprints)?
Alternatively, would you have 2 sets of changesets (one for dev that accumulates all the changes made during the sprints, one for production that minimizes the amount of changes)?
Is this all going against the "wrong is right" principle (page 17 of https://docs.google.com/presentation/d/1U8vESZVbj-zFE-K1Vh5dVfiH8xns9Gv9zDzg0DZBcKc/edit?pli=1#slide=id.g119ea23dc_00)?
Creating separate changesets for production sort of defeats the whole purpose of tracking migrations. That being said, if you're determined to go down this path, have a look at contexts. You can tag certain changesets as those approved for production, and run your production migration selecting only for those tags.
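As a rough sketch of how that could look in a SQL-formatted changelog (the ids and context names are illustrative):

    --liquibase formatted sql

    --changeset bob:101 context:prod
    alter table customer add loyalty_tier varchar(20);

    --changeset bob:102 context:dev
    insert into customer (id, name) values (1, 'test user');

Running "liquibase update --contexts=prod" would then apply only the changeSets marked for production, while dev environments run with "--contexts=dev,prod" (or with no context filter at all, which runs everything).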
What I usually recommend is to just not worry about the unneeded changeSets. They may create a table and then drop it again, but the database is really fast at doing that.
Modifying the changeLog can avoid the production database creating and dropping unnecessary objects, but in the process your dev databases can easily end up different from production due to changeSets that ran on them but not on production. Furthermore, the testing you already did was against the original sequence of changes, which may or may not hold for the new one, and you want no surprises when you deploy to production.
The only time I recommend pulling out changeSets is for unnecessary expensive operations such as a createIndex followed by a dropIndex on a large table.

How to continuously deliver a SQL-based app?

I'm looking to apply continuous delivery concepts to a web app we are building, and wondering if there is any solution for protecting the database from an accidental erroneous commit. For example, a bug that erases a whole table instead of a single record.
How can the impact of this issue be limited according to continuous delivery doctrine, where the application is deployed gradually over segments of infrastructure?
Any ideas?
Well, first, you cannot tell just from looking what a bad SQL statement is. You might have wanted to delete the entire contents of the table. Therefore it is not physically possible to have an automated tool that detects intent.
So to protect your database, first make sure you are in full recovery (not simple) mode and have full backups nightly and transaction log backups every 15 minutes or so. Now you cannot lose much information no matter how badly the process breaks. Your DBAs should be trained to be able to recover to a point in time. If you don't have any DBAs, I'd suggest the best thing you can do to protect your data is hire some. This is non-negotiable in any non-trivial database environment and it is terribly risky not to have trained, experienced DBAs if your data is critical to the business.
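In SQL Server terms, that baseline looks roughly like this (the database name and paths are placeholders):

    -- One-off: make sure the database uses the full recovery model.
    ALTER DATABASE AppDb SET RECOVERY FULL;

    -- Nightly full backup.
    BACKUP DATABASE AppDb TO DISK = N'D:\backup\AppDb_full.bak' WITH INIT;

    -- Every 15 minutes: transaction log backup, which is what makes
    -- point-in-time recovery possible.
    BACKUP LOG AppDb TO DISK = N'D:\backup\AppDb_log.trn';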
Next, you need to treat SQL like any other code: it should be in source control, in scripts. If you are terribly concerned about accidental deletions, then write the delete scripts to copy all deleted rows to a staging table, and clear out the staging table once a week or so. Enforce this convention in code reviews. Or better yet, set up an auditing process that runs through triggers. Once all records are audited, it is much easier to get back the 150 accidental deletions without having to restore a database. I would never consider having any enterprise application without auditing.
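A bare-bones version of the trigger-based auditing for deletes might look like this in SQL Server (table and column names are invented):

    -- Audit table mirrors the columns you care about, plus bookkeeping.
    CREATE TABLE dbo.orders_deleted (
        order_id    INT,
        customer_id INT,
        deleted_at  DATETIME DEFAULT GETDATE(),
        deleted_by  SYSNAME  DEFAULT SUSER_SNAME()
    );
    GO
    CREATE TRIGGER dbo.trg_orders_delete
    ON dbo.orders
    AFTER DELETE
    AS
    BEGIN
        -- 'deleted' is the pseudo-table holding the rows being removed.
        INSERT INTO dbo.orders_deleted (order_id, customer_id)
        SELECT order_id, customer_id
        FROM deleted;
    END;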
All SQL scripts without exception should be code-reviewed just like other code. All SQL scripts should be tested on QA and passed before moving to production. This will greatly reduce the possibility for error. No developer should have write rights to production; only DBAs should have that. Therefore each script should be written so that it can just be run, not run one chunk at a time where you could accidentally forget to highlight the where clause. Train your developers to use transactions correctly in the scripts as well.
Your concern is bad data happening to the database. The solution is to use full logging of all transactions so you can back out of transactions that you want to. This would usually be used in a context of full backups/incremental backups/full logging.
SQL Server, for instance, allows you to restore to a point in time (http://msdn.microsoft.com/en-us/library/ms190982(v=sql.105).aspx), assuming you have full logging.
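The recovery itself is roughly a two-step restore: bring back the last full backup without recovering, then roll the log forward to just before the bad statement ran (names and the timestamp are placeholders):

    RESTORE DATABASE AppDb
        FROM DISK = N'D:\backup\AppDb_full.bak'
        WITH NORECOVERY, REPLACE;

    RESTORE LOG AppDb
        FROM DISK = N'D:\backup\AppDb_log.trn'
        WITH STOPAT = '2016-03-14 10:42:00', RECOVERY;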
If you are creating and dropping tables, this could be an expensive solution, in terms of space needed for the log. However, it might meet your needs for development.
You may find that full-logging is too expensive for such an application. In that case, you might want to make periodic backups (daily? hourly?) and just keep these around. For this purpose, I've found LightSpeed to be a good product for fast and efficient backups.
One strategy that is commonly adopted is to log incremental SQL statements, rather than a collective schema generation, so you can control the changes at a much more granular level:
ex:

    change 1:
        UP:   add column
        DOWN: remove column
    change 2:
        UP:   add trigger
        DOWN: remove trigger
Once the changes are captured incrementally like this, you can have a simple but efficient script to upgrade (UP) from any version to any version without having to worry about the individual changes involved. When the change numbers are linked to builds, it becomes even more effective: when you deploy a build, the database is also automatically upgraded (UP) or downgraded (DOWN) to that specific build.
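A concrete, hand-rolled sketch of that idea (the file names, table names, and the version-tracking table are all hypothetical):

    -- One-row table recording which change the database is currently at.
    CREATE TABLE schema_version (version INT NOT NULL);
    INSERT INTO schema_version (version) VALUES (0);

    -- 001_add_phone_column.up.sql
    ALTER TABLE customer ADD phone VARCHAR(20);
    UPDATE schema_version SET version = 1;

    -- 001_add_phone_column.down.sql
    ALTER TABLE customer DROP COLUMN phone;
    UPDATE schema_version SET version = 0;

The deploy step then just compares the build's target version with the value in schema_version and runs the UP or DOWN scripts in order until they match.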
We have a pipeline app which does that at CloudMunch.

Database Modification Scripts - Rollout / Rollback Best Practice?

I'm looking for insight into best practices regarding database scripting for modifications that go out alongside other code changes for a software system.
I used to work for a company that insisted that every roll out had a roll back ready in case of issues. This sounds sensible, but in my opinion the roll back code for database modifications deployed via scripts has the same likelihood of failing as the roll out script.
For managed code, version control makes this quite simple, but for a database schema, rolling back changes is not so easy - especially if data is changed as part of the roll out.
My current practice is to test the roll out code by running against a test database during late stage development, and then run the application against that test database. Following this, I back up the live DB, and proceed with the roll out.
I'm yet to run into a problem, but am wondering how other shops manage database changes, and what the strategy is for recovering from any bugs.
All of our database scripts go through several test phases against databases that are like our live database. This way we can be fairly certain that the modification scripts will work as expected.
For rolling back, stored procedures, views, functions, triggers, everything programmatic is easy to roll back, just apply the previous version of the object.
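For those programmable objects, "apply the previous version" is literally just re-running the old definition from source control; a tiny sketch (the procedure and column names are made up):

    -- Redeploy the definition from the previous release tag in source control.
    ALTER PROCEDURE dbo.usp_get_orders
        @customer_id INT
    AS
    BEGIN
        -- previous, known-good body restored verbatim
        SELECT order_id, order_date
        FROM dbo.orders
        WHERE customer_id = @customer_id;
    END;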
Like you mentioned, the challenging part comes when updating / deleting records from tables, or even adding new columns to tables. And you're right that in this case the rollback can be just as likely to fail.
What we do, if we have a change that can't be easily rolled back but touches a sensitive / critical section, is keep a set of rollback scripts that also go through the same testing environments. We run the update script, validate that it works as expected, then run the rollback script and validate that the system works as it did prior to the modification.
Another thing that we do as just a precaution is to create a database snapshot (SQL Server 2005) prior to an update. That way if there are any unexpected issues, we can use the snapshot to recover any data that was potentially lost during the update.
So the safest course of action is to test against databases that are as close to your live system as possible, and to test your rollback scripts as well... and just in case both of those fail, have a snapshot ready just in case you need it.
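For reference, the snapshot step mentioned above is only a couple of statements on the SQL Server side (database, logical file, and path names are placeholders):

    -- Taken immediately before the update is run.
    CREATE DATABASE AppDb_predeploy
        ON (NAME = AppDb_Data, FILENAME = N'D:\snapshots\AppDb_predeploy.ss')
    AS SNAPSHOT OF AppDb;

    -- If the update goes badly wrong, revert the whole database to the snapshot.
    RESTORE DATABASE AppDb FROM DATABASE_SNAPSHOT = 'AppDb_predeploy';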
SQL Diff (or something like it) is always helpful if you are using a test database. It has a lot of checks and balances, safeguards, and ways of restoring or rolling back if there is an issue. Very useful.
http://www.apexsql.com/sql_tools_diff.aspx

How to rollback a database deployment without losing new data?

My company uses virtual machines for our web/app servers. This allows for very easy rollbacks of a deployment if something goes wrong. However, if an app server deployment also requires a database deployment and we have to rollback I'm kind of at a loss. How can you rollback database schema changes without losing data? The only thing that I can think of is to write a script that will drop/revert tables/columns back to their original state. Is this really the best way?
But if you do drop columns then you will lose data, since those columns/tables (supposedly) will contain some data. And since rollbacks are usually temporary - a bug is found, a rollback is made to keep things running while it's fixed, and then more or less the same changes are re-installed - the users could get quite upset if you lost that data and they had to re-enter it when the system was fixed.
I'd suggest that you only allow additions of tables and columns, no alterations or deletions. Then you can roll back just the code and leave the data as is. If you have a lot of rollbacks you might end up with some unused columns, but adding a table/column by mistake shouldn't happen that often, and in that case the DBA can remove them manually.
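A small illustration of what "additions only" means in practice (names are made up): the old application code never references the new objects, so rolling the code back needs no schema change at all.

    -- Safe to leave behind after a code-only rollback; old code simply ignores these.
    ALTER TABLE orders ADD loyalty_points INT NULL;  -- nullable, nothing the old code must supply

    CREATE TABLE order_feedback (
        order_id INT NOT NULL,
        comment  VARCHAR(4000)
    );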
Generally speaking, you cannot do this.
However, assuming that such a rollback makes sense, it implies that the data you are trying to retain is independent of the schema changes you'd like to revert.
One way to deal with it would be to:
back up only the data (as a script),
revert the schema to the old one, and
restore the data.
The above works well as long as the schema changes do not invalidate the generated script (for example, changing the number of columns would be tricky).
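A rough T-SQL sketch of those three steps, assuming the release added a table and a column whose data should survive the rollback (all names are hypothetical):

    -- (assumes a rollback_backup schema exists to hold the saved copies)

    -- 1) Preserve the data created since the deployment.
    SELECT * INTO rollback_backup.order_note FROM dbo.order_note;
    SELECT order_id, priority INTO rollback_backup.order_priority FROM dbo.orders;

    -- 2) Revert the schema to the previous version.
    DROP TABLE dbo.order_note;
    ALTER TABLE dbo.orders DROP COLUMN priority;

    -- 3) When the fixed release is re-applied later, put the data back
    --    (run the "up" scripts first, then):
    INSERT INTO dbo.order_note SELECT * FROM rollback_backup.order_note;
    UPDATE o
    SET    o.priority = b.priority
    FROM   dbo.orders o
    JOIN   rollback_backup.order_priority b ON b.order_id = o.order_id;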
This question has details on tools available in MS SQL for generating scripts.