We have a couple of database schemas and are investigating a migration to Liquibase. (One of the schemas has already been migrated to Liquibase.)
An important question for us is whether Liquibase supports a dry run:
We need to run the database changes on all schemas without committing, to make sure we will not hit any problems.
On success, all database changes would be run once again with a commit.
(This question is similar to SQL Server query dry run, but related to Liquibase.)
Added after the answer
I have read the documentation on updateSQL, and it does not answer the requirements of a “dry run”.
It just generates the SQL (in the command line, in the Ant task and in the Maven plugin).
I will clarify my question:
Does Liquibase support control over transactions?
I want to open a transaction before executing the Liquibase changelog, and to roll the transaction back after the changelog execution.
Of course, I need to verify the result of the execution.
Is this possible?
Added
Without control over transactions (or a dry run) we cannot migrate all our schemas to Liquibase.
Please help.
You can try "updateSQL" mode: it will connect to the database (checking your access rights), acquire the database lock, generate and print the SQL statements to be applied (based on the database state and your current Liquibase changesets), print the changeset IDs missing from the current state of the database, and release the database lock.
Unfortunately, no.
By default, Liquibase commits the transaction after executing all statements of a changeset. I assume that the migration paths you have in mind usually involve more than a single changeset.
The only way you can modify the transaction behavior is the runInTransaction attribute of the <changeSet> tag, as documented here. Setting it to false effectively disables transaction management, i.e. it enables auto-commit mode, as you can see in ChangeSet.java.
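For illustration, a changeSet with transaction management disabled might look like the sketch below (the id, author, table, and index names are placeholder values, not from the original question):

```xml
<changeSet id="1" author="example" runInTransaction="false">
    <!-- runs in auto-commit mode; a failure here leaves the change half-applied -->
    <createIndex tableName="big_table" indexName="idx_big_table_col">
        <column name="col"/>
    </createIndex>
</changeSet>
```

Note that this attribute moves you further away from a dry run, not closer: with auto-commit enabled there is nothing left to roll back.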
I think that this feature could be a worthwhile addition to Liquibase, so I opened a feature request: CORE-1790.
I think the answer to your question is "it does not support dry runs", but the problem lies primarily with the database, not with Liquibase.
Liquibase does run each changeSet in a transaction and commits it after inserting into the DATABASECHANGELOG table, so in theory you could override the Liquibase logic to roll back that transaction instead of committing it. However, you will run into the problem that most SQL run by Liquibase is auto-committing.
For example, if you had a changeSet of:
<changeSet>
    <createTable tableName="test">
        ...
    </createTable>
</changeSet>
What is actually run is:
START TRANSACTION
CREATE TABLE NAME ...
INSERT INTO DATABASECHANGELOG...
COMMIT
but even if you changed the last command to ROLLBACK, the CREATE TABLE call would auto-commit when it runs, and the only thing that would actually roll back is the INSERT.
NOTE: there are some databases, such as PostgreSQL, that will roll back DDL SQL, but the majority will not.
INSERT/UPDATE commands do run in a transaction and could be automatically rolled back at the end, but Liquibase does not have a postCondition command to do the in-transaction check of the state that would be required. That would be a useful feature (https://liquibase.jira.com/browse/CORE-1793), but even it would not be usable if there are any auto-committing change tags in the changeset. If you added a postcondition to the create-table example above, the postcondition would fail and the update would fail, but the table would still be there.
If your Liquibase migration is sufficiently database agnostic, you can simply run it against an in-memory H2 database (or some other "throwaway" database) that you can spin up with a few lines of code.
import java.util.Properties;

import liquibase.Liquibase;
import liquibase.database.DatabaseFactory;
import liquibase.database.jvm.JdbcConnection;
import liquibase.resource.FileSystemResourceAccessor;

var info = new Properties();
info.put("user", "sa");
info.put("password", "");
try (var con = new org.h2.Driver().connect("jdbc:h2:mem:db", info)) {
    var accessor = new FileSystemResourceAccessor();
    var jdbc = new JdbcConnection(con);
    var database = DatabaseFactory.getInstance().findCorrectDatabaseImplementation(jdbc);
    Liquibase liquibase = new Liquibase("/path/to/liquibase.xml", accessor, database);
    liquibase.update("");
}
I've blogged about this approach more in detail here.
Related
I have a test that updates my database each time I run it,
so I cannot run the test again with the updated values.
I am currently recreating the WHOLE database with:
postgres=# drop database mydb;
DROP DATABASE
postgres=# CREATE DATABASE mydb WITH TEMPLATE mycleandb;
CREATE DATABASE
This takes a while.
Is there any way I can update just the tables that I changed, using the tables from mycleandb?
Transactions
You haven't mentioned what your programming language or framework is. Many of them have built-in test mechanisms that take care of this sort of thing. If you are not using one of them, what you can do is start a transaction in each test setup and then roll it back when you tear the test down.
BEGIN;
...
INSERT ...
SELECT ...
DELETE ...
ROLLBACK;
ROLLBACK, as the name suggests, reverses everything that has been done to the database, so that it remains in its original condition.
There is one small problem with this approach, though: you can't do integration tests where you intentionally enter incorrect values and cause a query to fail an integrity check. If you do that, the transaction aborts and no new statements can be executed until it is rolled back.
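In PostgreSQL you can work around that limitation with a savepoint; a minimal sketch, assuming a table t with a unique constraint on column a:

```sql
BEGIN;
SAVEPOINT before_bad_insert;
INSERT INTO t (a) VALUES (1);
INSERT INTO t (a) VALUES (1);          -- fails: unique constraint violation
ROLLBACK TO SAVEPOINT before_bad_insert;
-- the transaction is usable again; further test statements can run
SELECT * FROM t;
ROLLBACK;                              -- database back to its original state
```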
pg_dump/pg_restore
It's possible to use the -t option of pg_dump to dump and then restore one or a few tables. This may be the next best option when transactions are not practical.
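A sketch of that approach, with placeholder database and table names:

```shell
# dump only the table(s) that the tests changed, from the clean copy
pg_dump -t changed_table mycleandb > changed_table.sql

# drop the modified version and restore the clean one
psql mydb -c 'DROP TABLE IF EXISTS changed_table'
psql mydb -f changed_table.sql
```

Keep foreign keys in mind: restoring a single table that other tables reference may require dropping and re-creating constraints as well.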
Non Durable Settings / Ramdisk
If both above options are inapplicable please see this answer: https://stackoverflow.com/a/37221418/267540
It's on a question about Django testing, but there's very little Django-specific material in it. Coincidentally, though, Django's rather excellent test framework relies by default on the begin/rollback mechanism described above.
Test inside a transaction:
begin;
update t
set a = 1;
Check the results and then:
rollback;
It will be back in a clean state.
I'm using Liquibase 3.3.5 to update my database. Contexts are a nice way to execute only specific parts of the changelog. But I don't understand why ALL changesets are executed when no context is provided on update. Consider the following example:
changeset A: context=test
changeset B: no context
changeset C: context=prod
So
executing update with context=test, will execute changeset A+B.
executing update with context=prod, will execute changeset B+C.
executing update with no context, will execute changeset A+B+C.
For me, this doesn't make sense at all :).
I would expect that only changeset B would be executed, since it doesn't define a specific context.
In the Liquibase contexts example (http://www.liquibase.org/documentation/contexts.html, "Using Contexts for Test Data") they say that one should mark the changesets for testing with "test" and execute them by passing the context "test" to apply test data. Fine - makes sense. But:
"When it comes time to migrate your production database, don’t include the “test” context, and your test data will not be included."
So, if I didn't specify the "test" context when executing a production update, it would execute the "test" changesets as well, since I didn't specify a context at all.
Again, I would expect that leaving out "test" on the update execution would perform only the regular changesets, without the test changesets.
Or I'm missing something here :)?
This is just how Liquibase works - if you do an update and don't specify a context, then all changesets are considered as applicable to that update operation.
There were a couple of ways this could have been implemented, and the development team had to pick one:
1. If you don't specify a context during an update operation, no changesets are considered.
2. If you don't specify a context, all changesets are considered.
3. If you don't specify a context, only changesets that have no context are considered.
4. If you don't specify a context and none of the changesets have contexts, all changesets are considered; but if some of the changesets do have contexts, fall back to option 1, 2, or 3 above.
The team could have gone with option 3 (which matches your expectation) but decided long ago to go with option 2, as that seemed like the 'best' way at the time. I wasn't on the team then, so I don't know any more than that.
I will add a solution of my own (from my perspective the default Liquibase behavior is not intuitive). In our project, to deal with the "problem", we configured the Liquibase contexts this way:
liquibase.setChangeLog("classpath*:liquibase/master.xml");
contexts = StringUtils.isBlank(contexts) ? "none" : contexts;
liquibase.setContexts(contexts);
This causes Liquibase to run all changesets with the context 'none' plus all default changesets (changesets without a context) - yes, this is how it works.
So pick a name that nobody on your team will ever use ('none' in our case) as the context name, and then run Liquibase with that context by default (see the example above). With that approach you run only the changesets without any context, which I would argue should be the default behavior!
I just mark the development changeSets as "dev" (or "test") and don't specify a context on the changeSets that run in both. When I do an update on production, I specify contexts=prod in the update even though no changesets are marked as prod. That makes it skip all the "dev" (or "test") context ones but still execute all the non-context-ed changeSets. You are then also set up for some point in the future when you need a context="prod" changeSet that only runs in prod.
Source: http://forum.liquibase.org/topic/using-context-for-development-only-and-production-changesets
And what happens if you or your admin forget to specify a context? Yes, it will execute A+B+C, which on production can break many things and make your life rather unhappy.
I'm looking for a solution that helps in these cases and aborts the Liquibase execution right at the start when Liquibase is run without any contexts.
It would be nice if Liquibase had a property (in liquibase.properties) to restrict running with/without contexts...
As a solution, you can add contexts=default,contexts,of,your,project to the liquibase.properties file.
It might be too late for #javg, but this may benefit future readers.
This requirement could be achieved as follows:
changeset A: context=test
changeset B: context=all
changeset C: context=prod
So
executing update with "context=test,all" will execute changeset A+B.
executing update with "context=all,prod" will execute changeset B+C.
executing update with "context=all" will only execute changeset B as you expect.
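In changelog XML, that scheme might look like the following sketch (ids and authors are placeholders):

```xml
<changeSet id="A" author="example" context="test">
    ...
</changeSet>
<changeSet id="B" author="example" context="all">
    ...
</changeSet>
<changeSet id="C" author="example" context="prod">
    ...
</changeSet>
```

Because every changeset now carries an explicit context, an update that passes no context at all would still run everything, so the "all" context has to be supplied on every update.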
I want to understand better how Liquibase executes changesets.
1)
a) For example, I have a changelog with 4 changesets and I execute updateDatabase (http://www.liquibase.org/documentation/ant/updatedatabase_ant_task.html).
Liquibase will execute the 4 changesets.
b) If I run the same changelog once again, Liquibase will not execute any changeset.
c) If I add a new changeset to the changelog and run the changelog, Liquibase will execute only the new changeset.
Questions:
How does Liquibase know which changesets to execute?
How does Liquibase know which changesets have already been executed?
2) How important is the changeset ID? Can I change it after a changelog execution?
3) How important is the changeset author? Can I change it after a changelog execution?
4) What happens if I execute rollbackDatabase (http://www.liquibase.org/documentation/ant/rollbackdatabase_ant_task.html)?
How does Liquibase know which changesets to roll back?
a) What happens if I execute the rollback after 1 a)?
Will Liquibase call the rollback element located in each of the changesets (4 rollback elements)?
b) What happens if I execute the rollback after 1 b)?
How will Liquibase know not to call any rollback element?
c) What happens if I execute the rollback after 1 c)?
Will Liquibase call the rollback element of only the new changeset?
I can answer a few questions, perhaps not all though.
1. Liquibase creates 2 new tables in the database when you do the first update. The main table is DATABASECHANGELOG, and it is used to keep track of which changesets have been applied to the database. Liquibase identifies each changeset by a composite key of id, author, and path. Liquibase also generates a checksum of each changeset, which is used to tell whether the changeset has been altered after being applied to the database.
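You can look at that bookkeeping yourself; a sketch of a query against the tracking table (the exact column set varies between Liquibase versions):

```sql
SELECT id, author, filename, dateexecuted, md5sum
FROM databasechangelog
ORDER BY orderexecuted;
```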
2. and 3. Because the changeset id and author are used as part of the primary key, if you deploy and then change either of them, you may run into unexpected behavior on subsequent deploys. I think the id and author are also part of the checksum calculation, so that might affect things as well. I would recommend that you do not change them after deploying.
4. Rollback uses the same mechanism to know which changesets to roll back. When you roll back, you have to specify in some way which changes to undo - see this page for more info: http://www.liquibase.org/documentation/rollback.html
The rollback identification mechanisms are by tag (which means you have to apply tags when you deploy), by date (Liquibase keeps track of when each changeset was deployed), or by number/count (which implicitly uses the order in which the changesets were deployed).
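From the command line, the three mechanisms look roughly like this (the tag, date, and count values are placeholders):

```shell
liquibase rollback v1.2           # back to the state at tag "v1.2"
liquibase rollbackToDate 2015-01-01
liquibase rollbackCount 1         # undo the most recently applied changeset
```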
In HSQLDB, there can't be any active transactions when you change TRANSACTION CONTROL.
Flyway, in turn, before executing the SQL of a migration, sets autocommit=false and executes some statements of its own. So if the migration contains a SET DATABASE TRANSACTION CONTROL statement, it will wait for those uncommitted statements forever, causing the application to hang.
(Side note: the statements executed by Flyway before a migration vary from version to version; e.g. in 1.7 they were pure selects, so changing from LOCKS to MVCC was possible, but once I had MVCC any subsequent DDL statements in further migrations hung; in Flyway 2.0 it was a SELECT ... FOR UPDATE on the schema_version table, so any transaction-control change hung; in 2.2 the SELECT ... FOR UPDATE was changed to an explicit lock, with the same effect as in 2.0.)
So basically it is not possible to change the transaction control inside Flyway migrations. On the other hand, Flyway discourages changes made outside of its migrations. Any idea, then, how to change the transaction control with Flyway/HSQLDB?
Update
Another observation is that when the database transaction control is set to MVCC, any DDL statement in a Flyway migration hangs the application too. So I would set LOCKS before each migration and restore MVCC after it. Would that be a clean solution from the Flyway perspective?
import java.sql.Connection;
import java.sql.SQLException;

import com.googlecode.flyway.core.util.jdbc.JdbcUtils;

public void migrate() {
    setDbTransactionControl("LOCKS");
    flyway.migrate();
    setDbTransactionControl("MVCC");
}

private void setDbTransactionControl(String mode) {
    Connection connection = null;
    try {
        connection = JdbcUtils.openConnection(ds);
        connection.createStatement().execute("SET DATABASE TRANSACTION CONTROL " + mode);
    } catch (SQLException e) {
        // log it
    } finally {
        // close exactly once here; the original code also closed the
        // connection in the catch block, which double-closed it on error
        JdbcUtils.closeConnection(connection);
    }
}
This is not possible inside a Flyway migration.
Before Flyway starts a migration, it opens a transaction in a separate connection to acquire a lock on its metadata table. So you will never be able to execute a statement that absolutely must run without any other active transactions.
Your best option is probably to set it on the DataSource, so that it can initialize each connection this way upon creation.
Try the Flyway callbacks beforeMigrate and afterMigrate. Both run outside the migration transactions. MVCC should be used by my application, so the JDBC URL contains hsqldb.tx=mvcc. I could successfully change the transaction model around the Flyway migration with beforeMigrate.sql (SET DATABASE TRANSACTION CONTROL LOCKS;) and afterMigrate.sql (SET DATABASE TRANSACTION CONTROL MVCC;). There are also Java versions of the callbacks. I'm using HSQLDB 2.3.3 and Flyway 3.2.1.
I know that in MySQL, DDL statements such as ALTER TABLE / CREATE TABLE / etc. cause an implicit transaction commit.
As we are moving to PostgreSQL, is it possible to wrap multiple DDL statements in a transaction?
This would make migration scripts a lot more robust; a failed DDL change would cause everything to roll back.
DDL statements are covered by transactions. I can't find the relevant section in the official documentation, but I have provided a link to the wiki, which covers it.
Just remember that transactions aren't opened automatically in PostgreSQL; you must start them with BEGIN or START TRANSACTION.
Postgresql Wiki about Transactional DDL
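A quick way to see transactional DDL in action from psql:

```sql
BEGIN;
CREATE TABLE demo (id integer);
INSERT INTO demo VALUES (1);
ROLLBACK;
-- the table is gone again: SELECT * FROM demo now fails with
-- ERROR:  relation "demo" does not exist
```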
Not every Postgres DDL statement can be wrapped in a transaction. Statements like DROP DATABASE / DROP TABLESPACE and some other file-system-related statements can't be rolled back.
Also:
ALTER TYPE ... ADD VALUE (the form that adds a new value to an enum type) cannot be executed inside a transaction block.
Also, some statements like TRUNCATE are 'not MVCC-safe': changes made by that kind of statement can affect other queries even if they are rolled back.
So - read the official manual for your version of Postgres to find out whether your DDL statements are transaction-safe.