Better understanding of how Liquibase executes change sets - liquibase

I want to understand better how Liquibase executes change sets.
1)
a) For example I have a change log with 4 change sets and I execute updateDatabase (http://www.liquibase.org/documentation/ant/updatedatabase_ant_task.html).
Liquibase will execute 4 change sets.
b) If I run the same change log again, Liquibase does not execute any change set.
c) If I add a new change set to the change log and run it, Liquibase executes only the new change set.
Questions:
How does Liquibase know which change sets to execute?
How does Liquibase know which change sets have already been executed?
2) How important is the change set ID? Can I change it after the change log has been executed?
3) How important is the change set author? Can I change it after the change log has been executed?
4) What happens if I execute rollbackDatabase (http://www.liquibase.org/documentation/ant/rollbackdatabase_ant_task.html)?
How does Liquibase know which change sets to roll back?
a) What happens if I execute the rollback after 1 a)?
Will Liquibase call the rollback element located in each change set (4 rollback elements)?
b) What happens if I execute the rollback after 1 b)?
How will Liquibase know not to call any rollback element?
c) What happens if I execute the rollback after 1 c)?
Will Liquibase call the rollback element of only the new change set?

I can answer a few questions, perhaps not all though.
1 a-c. Liquibase creates two new tables in the database when you run the first update. The main table is DATABASECHANGELOG, which keeps track of which change sets have been applied to the database. Liquibase identifies each changeset by the combination of id, author, and path, which together act as a composite key. Liquibase also generates a checksum of each changeset, which it uses to tell whether the changeset has been altered after being applied to the database.
2 and 3. Because the change set id and author are part of that composite key, if you deploy and then change either one of them, you may run into unexpected behavior on subsequent deploys. I think that the id and author are also part of the checksum calculation, so that might affect things as well. I would recommend that you do not change them after deploying.
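To make the bookkeeping concrete, here is a minimal sketch of the idea in plain Java. It is not Liquibase's actual implementation: the class name ChangeSetTracker, the "id::author::path" key format, and the in-memory set standing in for DATABASECHANGELOG are all assumptions made for illustration, and checksums are left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Simplified model of change set tracking; NOT Liquibase's real implementation.
class ChangeSetTracker {
    // Stands in for the rows of DATABASECHANGELOG: one key per applied change set.
    private final Set<String> applied = new LinkedHashSet<>();

    // Hypothetical key format combining the three identifying fields.
    static String key(String id, String author, String path) {
        return id + "::" + author + "::" + path;
    }

    // Runs every change set whose key is not recorded yet and returns
    // the keys that were executed by this call, in change log order.
    List<String> update(List<String> changeLog) {
        List<String> executed = new ArrayList<>();
        for (String k : changeLog) {
            if (applied.add(k)) { // add() returns false if the key is already recorded
                executed.add(k);
            }
        }
        return executed;
    }
}
```

In this model, calling update twice with the same four keys executes four change sets and then none, and appending a fifth key executes only that one. It also shows why editing an id or author after deployment is risky: the edited change set produces a key that is not recorded, so it looks like a brand-new change set.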
4. Rollback uses the same mechanism to know which change sets to roll back. When you roll back, you have to specify in some way which changes to undo. See this page for more info: http://www.liquibase.org/documentation/rollback.html
The rollback identification mechanisms are by tag (which means you have to apply tags when you deploy), by date (Liquibase keeps track of when each changeset was deployed), or by number (which implicitly uses the date/time of when each changeset was deployed).
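The selection step can be sketched as follows. This is a simplified model in plain Java, not Liquibase code: RollbackSelector, Row, lastN, and afterTag are hypothetical names, and the history list is assumed to be ordered oldest first, the way the recorded deployment time would order it.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of rollback target selection; NOT Liquibase's real implementation.
class RollbackSelector {
    // Hypothetical stand-in for one DATABASECHANGELOG row.
    static final class Row {
        final String id;
        final String tag; // null when the row carries no tag

        Row(String id, String tag) {
            this.id = id;
            this.tag = tag;
        }
    }

    // Count-style selection: the last n applied change sets, newest first.
    static List<String> lastN(List<Row> history, int n) {
        List<String> ids = new ArrayList<>();
        for (int i = history.size() - 1; i >= 0 && ids.size() < n; i--) {
            ids.add(history.get(i).id);
        }
        return ids;
    }

    // Tag-style selection: everything applied after the tagged row, newest first.
    static List<String> afterTag(List<Row> history, String tag) {
        List<String> ids = new ArrayList<>();
        for (int i = history.size() - 1; i >= 0; i--) {
            if (tag.equals(history.get(i).tag)) {
                return ids; // the tagged change set itself is kept
            }
            ids.add(history.get(i).id);
        }
        throw new IllegalArgumentException("tag not found: " + tag);
    }
}
```

Both selections walk the history backwards, so the most recently applied change set is the first one to be undone.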

Related

What steps does Liquibase follow when rolling back?

I don't understand the detailed steps when rolling back using Liquibase.
I had a scenario with 6 changesets, and for one changeSet the rollback was not defined, that is, it contained only an empty <rollback/> within the changeset.
After deploying with deployIT I could see 7 entries in the DATABASECHANGELOG table: 6 for the added changesets and one for the tag created by deployIT.
After rolling back, all 6 newly added changesets were removed, even though one of them had an empty rollback tag.
Can any expert tell me why? What is the exact behavior of rollback?
Overall, I want to know when records are removed from DATABASECHANGELOG.
When running rollback, Liquibase finds the changeSets to roll back, and then checks each one for a <rollback> tag describing how to roll the changeSet back.
If there is no <rollback> tag, then Liquibase checks if the changes in the changeSet have built-in logic on how to roll themselves back. Like gile pointed out, if there is enough information in the change to undo it (like how the createTable change has the table name needed to drop the table) it will be able to still roll them back.
But if there isn't enough information in the change (like how a dropTable doesn't have the information needed to re-create the table) then the rollback command will fail with a "cannot roll back" error.
So the rollback logic is:
Use what is defined in a <rollback> block
If no rollback block, try to deduce what is needed
If there isn't enough information to roll back, exit before rolling back
If you specify an empty rollback block, you are telling Liquibase "the logic needed to roll this back is to do nothing", so Liquibase happily runs your no-op rollback command and marks the changeSet as rolled back.
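The decision order described above can be written down as a small sketch. This is plain Java that models the rules only, not Liquibase's own code: RollbackPlanner and plan are hypothetical names, and representing a missing block as null and an empty block as "" is an assumption of this example.

```java
// Simplified model of the rollback decision; NOT Liquibase's real implementation.
class RollbackPlanner {
    // explicitRollback: null = no <rollback> block, "" = empty block, otherwise the rollback SQL.
    // autoRollback: null when the change cannot derive its own rollback (e.g. dropTable).
    static String plan(String explicitRollback, String autoRollback) {
        if (explicitRollback != null) {
            // An empty block means "rolling back is a no-op", which is still a valid rollback.
            return explicitRollback;
        }
        if (autoRollback != null) {
            return autoRollback;
        }
        throw new IllegalStateException("cannot roll back: no rollback block and nothing can be derived");
    }
}
```

With this model, plan("", null) returns an empty statement instead of failing, which corresponds to the empty <rollback/> case in the question: the changeSet is treated as rolled back even though nothing was undone.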
Depending on your change sets, you may fall into the case of statements whose rollback commands are generated automatically, as described in the Liquibase Rollback documentation:
Many refactorings such as “create table”, “rename column”, and “add
column” can automatically create rollback statements. If your change
log contains only statements that fit into this category, your
rollback commands will be generated automatically.
Other refactorings such as “drop table” and “insert data” have no
corresponding rollback commands that can be automatically generated.
In these cases, and cases where you want to override the default
generated rollback commands, you can specify the rollback commands via
the <rollback> tag within the changeSet tag. If you do not want anything done to
undo a change in rollback mode, use an empty <rollback> tag.
At http://forum.liquibase.org/topic/understanding-rollback you can find more details and other links.

Why are all contexts executed when none is specified on update?

I'm using Liquibase 3.3.5 to update my database. Having contexts is a nice way to only execute specific parts of the changelog. But I don't understand, why ALL changesets are executed, when no context is provided on update. Consider the following example:
changeset A: context=test
changeset B: no context
changeset C: context=prod
So
executing update with context=test, will execute changeset A+B.
executing update with context=prod, will execute changeset B+C.
executing update with no context, will execute changeset A+B+C.
For me, this doesn't make sense at all :).
I would expect, that only changeset B will be executed, since it doesn't define a specific context.
In the Liquibase contexts example: http://www.liquibase.org/documentation/contexts.html ("Using Contexts for Test Data") they say, that one should mark the changesets for testing with "test", and executing them with giving the context "test" to apply testdata. Fine - make sense. But
"When it comes time to migrate your production database, don’t include the “test" context, and your test data not be included. "
So, if I wouldn't specify "test" context when executing production update, it would execute the "test" changesets as well, since I didn't specify a context at all.
Again, I would expect that leaving out test on update execution, would only perform the regular changesets without the test changesets.
Or am I missing something here :)?
This is just how Liquibase works - if you do an update and don't specify a context, then all changesets are considered as applicable to that update operation.
There were a couple of ways that this could have been implemented, and the development team had to pick one.
1. If you don't specify a context during an update operation, then no changesets are considered.
2. If you don't specify a context, then all changesets are considered.
3. If you don't specify a context, then only changesets that have no context are considered.
4. If you don't specify a context and none of the changesets have contexts on them, then all changesets are considered, but if some of the changesets do have contexts, go to option 1, 2, or 3 above.
The team could have gone with option 3 (which matches your expectation) but decided long ago to go with option 2, as that seemed like the 'best' way at the time. I wasn't on the team at that time, so I don't know any more than that.
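The chosen behavior (option 2) can be sketched as a small matching rule. This is a simplified model in plain Java, not Liquibase's implementation: ContextMatcher and shouldRun are hypothetical names, and only plain context names are handled, not context expressions.

```java
import java.util.Collections;
import java.util.Set;

// Simplified model of context matching; NOT Liquibase's real implementation.
class ContextMatcher {
    static boolean shouldRun(Set<String> changeSetContexts, Set<String> requestedContexts) {
        if (requestedContexts.isEmpty()) {
            return true; // no context given on update: every changeset is considered
        }
        if (changeSetContexts.isEmpty()) {
            return true; // a changeset without a context runs under any requested context
        }
        // Otherwise the changeset needs at least one context in common with the request.
        return !Collections.disjoint(changeSetContexts, requestedContexts);
    }
}
```

Applied to the example from the question, a request for "test" selects A and B, a request for "prod" selects B and C, and an empty request selects A, B, and C. Requesting any context name that no changeset uses selects only B.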
I will add my own solution (from my perspective the default Liquibase behavior is not intuitive). In our project, to deal with the "problem", we configured the Liquibase context this way:
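liquibase.setChangeLog("classpath*:liquibase/master.xml");
// Fall back to the placeholder context "none" when no context is configured
// (StringUtils.isBlank is presumably the Apache Commons Lang helper).
contexts = StringUtils.isBlank(contexts) ? "none" : contexts;
liquibase.setContexts(contexts);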
As a result, Liquibase runs all change-sets with the context 'none' plus all default change-sets (change-sets without a context). Yes, this is how it works.
So pick a name that nobody on your team will use as a context name ('none' in our case) and run Liquibase with that context by default (see the example above). With that approach only the change-sets without any context are run, which I assume should be the default behavior!
I just mark the development changeSets as "dev" [or "test"] and don't specify a context on the changesets that run in both. When I do an update on production, I will specify contexts=prod in the update even though there are no changesets marked as prod. That will make it skip all the dev [or "test"] context ones but will still execute all the non-context-ed changeSets. You are also then set up for some point in the future where you need to make a context="prod" changeSet that ... only runs in prod.
Source: http://forum.liquibase.org/topic/using-context-for-development-only-and-production-changesets
And what happens if you or your admin forgets to specify a context? Yes, it will execute A+B+C; in production that can break many things and make your life a lot less happy.
I'm looking for a solution that benefits in these cases and aborts the liquibase execution at the beginning (when you are running liquibase without any contexts).
It would be cool if the liquibase has a property (in liquibase.properties) to restrict running with/without contexts...
As a solution you can add contexts=default,contexts,of,your,project to the liquibase.properties file.
It might be too late for @javg, but this may benefit future readers.
This requirement could be achieved as follows:
changeset A: context=test
changeset B: context=all
changeset C: context=prod
So
executing update with "context=test,all" will execute changeset A+B.
executing update with "context=all,prod" will execute changeset B+C.
executing update with "context=all" will only execute changeset B as you expect.

How does Liquibase roll back when execution fails?

If I run a change log file that contains multiple change sets from the command line and it fails because of invalid SQL at, say, change set 2, then change set 1 has already been executed and committed. How do I roll back that change using Liquibase?
The easiest way would be to use the rollbackCount command. Running "liquibase rollbackCount 1" will roll back the last changeSet executed, using either the <rollback> block specified in the changeSet or, if Liquibase can work it out from the information in the changeSet, a generated rollback. For example, a createTable command has the information needed to create a drop table statement, but a dropTable command does not have the information needed to do a create table, so you would need to specify your own rollback block.

Does Liquibase support dry run?

We have a couple of data schemas and we are investigating migrating them to Liquibase. (One of the data schemas has already been migrated to Liquibase.)
Important question for us is if Liquibase supports dry run:
We need to run database changes on all schemas without commit to ensure we do not have problems.
In case of success all database changes run once again with commit.
(This question is similar to "SQL Server query dry run" but is about Liquibase.)
Added after the answer
I read the documentation for updateSQL and it does not meet the requirements of a “dry run”.
It just generates the SQL (on the command line, in the Ant task, and in the Maven plugin).
I will clarify my question:
Does Liquibase support control over transactions?
I want to open a transaction before executing the Liquibase changelog, and roll back the transaction after the changelog has executed.
Of course, I need to verify the result of the execution.
Is it possible?
Added
Without control over transactions (or a dry run) we cannot migrate all our schemas to Liquibase.
Please help.
You can try "updateSQL" mode. It connects to the database (checking your access rights), acquires the database lock, generates and prints the SQL statements that would be applied (based on the database state and your current Liquibase change sets), prints the changeset ids missing from the current state of the database, and releases the lock.
Unfortunately, no.
By default, Liquibase commits the transaction executing all statements of a changeset. I assume that the migration paths you have in mind usually involve more than a single changeset.
The only way you can modify the transaction behavior is the runInTransaction attribute of the <changeSet> tag, as documented here. Setting it to false effectively disables transaction management, i.e. it enables auto-commit mode, as you can see in ChangeSet.java.
I think that this feature could be a worthwhile addition to Liquibase, so I opened a feature request: CORE-1790.
I think your answer is "it does not support dry runs" but the problem is primarily with the database and not with liquibase.
Liquibase does run each changeSet in a transaction and commits it after inserting into the DATABASECHANGELOG table, so in theory you could override Liquibase's logic to roll back that transaction instead of committing it, but you would run into the problem that most SQL run by Liquibase is auto-committing.
For example, if you had a changeSet of:
<changeSet>
    <createTable tableName="test">
        ...
    </createTable>
</changeSet>
What is run is:
START TRANSACTION
CREATE TABLE NAME ...
INSERT INTO DATABASECHANGELOG...
COMMIT
but even if you changed the last command to ROLLBACK the create table call will auto-commit when it runs and the only thing that will actually roll back is the INSERT.
NOTE: there are some databases that will rollback DDL SQL such as postgresql, but the majority do not.
INSERT/UPDATE commands would run in a transaction and could be auto-rolled back at the end, but liquibase does not have a postCondition command to do the in-transaction check of the state that would be required. That would be a useful feature (https://liquibase.jira.com/browse/CORE-1793) but even it would not be usable if there are any auto-committing change tags in the changeset. If you added a postcondition to create table example above, the postcondition would fail and the update would fail, but the table would still be there.
If your Liquibase migration is sufficiently database agnostic, you can just run it on an in-memory H2 database (or some other "throwaway database") that you can spin up easily using a few lines of code.
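import java.util.Properties;
import liquibase.Liquibase;
import liquibase.database.DatabaseFactory;
import liquibase.database.jvm.JdbcConnection;
import liquibase.resource.FileSystemResourceAccessor;

var info = new Properties();
info.put("user", "sa");
info.put("password", "");
// The in-memory H2 database is discarded when the connection is closed.
try (var con = new org.h2.Driver().connect("jdbc:h2:mem:db", info)) {
    var accessor = new FileSystemResourceAccessor();
    var jdbc = new JdbcConnection(con);
    var database = DatabaseFactory.getInstance().findCorrectDatabaseImplementation(jdbc);
    Liquibase liquibase = new Liquibase("/path/to/liquibase.xml", accessor, database);
    // An empty contexts string means no context filter is applied.
    liquibase.update("");
}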
I've blogged about this approach more in detail here.

Undoing sql scripts

I have a problem to solve which requires undo operation of each executed sql file in Oracle Database.
I execute them from an XML file with MSBuild, using an exec command that runs sqlplus with the login and @*.sql.
Obviously rollback won't do, because it can't roll back an already committed transaction.
I have been searching for several days and still can't find the answer. What I learned is Oracle Flashback and Point in Time Recovery. The problem is that I want the changes to be undone only for the current user i.e. if another user makes some changes at the same time then my solution performs undo only on user 'X' not 'Y'.
I found the start_scn and commit_scn in flashback_transaction_query. But does it identify only one user? What if I flashback to a given SCN? Will that undo only for me or for other users as well? I have taken out
select start_scn from flashback_transaction_query WHERE logon_user='MY_USER_NAME'
and
WHERE table_name = "MY_TABLE NAME"
and performed
FLASHBACK TO SCN"here its number"
on a chosen operation's SCN. Will that work for me?
I also found out about Point in Time Recovery but as I read it makes the whole database unavailable so other users will be unable to work with it.
So I need something that will undo a whole *.sql file.
This is possible but maybe not with the tools that you use. sqlplus can rollback your transaction, you just have to make sure auto commit isn't enabled and that your scripts only contain a single commit right before you end the sqlplus session (if you don't commit at all, sqlplus will always roll back all changes when it exits).
The problems start when you have several scripts and you want, for example, to rollback a script that you ran yesterday. This is a whole new can of worms and there is no general solution that will always work (it's part of the "merge problem" group of problems, i.e. how can you merge transactions by different users when everyone can keep transactions open for as long as they like).
It can be done but you need to carefully design your database for it, the business rules must be OK with it, etc.
The general approach would be to have a table that records which rows were modified (created, updated, or deleted) by the script, plus the script name and the time when it was executed.
With this information, you can generate SQL which can undo the changes created by a script. To fill such a table, use triggers or generate your scripts in such a way that they write this information as well (note: This is probably beyond a "simple" sqlplus solution; you will have to write your own data loader for this).
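As a rough illustration of that bookkeeping, here is a minimal in-memory sketch in plain Java. It is an assumption-heavy simplification: UndoLog, record, and undoFor are hypothetical names, the list stands in for the tracking table, and each entry stores a ready-made undo statement rather than raw row data.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified in-memory stand-in for a per-script undo table; illustration only.
class UndoLog {
    private static final class Entry {
        final String script;
        final String undoSql;

        Entry(String script, String undoSql) {
            this.script = script;
            this.undoSql = undoSql;
        }
    }

    // Entries are kept in the order the changes were made.
    private final List<Entry> entries = new ArrayList<>();

    // Called once per modified row, e.g. by the data loader that runs the script.
    void record(String script, String undoSql) {
        entries.add(new Entry(script, undoSql));
    }

    // Returns the undo statements for one script, newest change first.
    List<String> undoFor(String script) {
        List<String> statements = new ArrayList<>();
        for (int i = entries.size() - 1; i >= 0; i--) {
            if (entries.get(i).script.equals(script)) {
                statements.add(entries.get(i).undoSql);
            }
        }
        return statements;
    }
}
```

Replaying the statements in reverse order matters when later changes in a script depend on earlier ones. The sketch does not address the harder problem mentioned above, namely what to do when another user has since modified the same rows.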
Ok I solved the problem by creating a DDL and DML TRIGGER. The first one takes "extra" column (which is the DDL statement you have just entered) from v$open_cursor and inserts into my table. The second gets "undo_sql" from flashback_transaction_query which is the opposite action of your DML action - if INSERT then undo_sql is DELETE with all necessary data.
Triggers work before DELETE,INSERT (DML) on specific table and ALTER,DROP,CREATE (DDL) on specific SCHEMA or VIEW.