The Setup
I am using Liquibase to manage my project's migrations.
I have several tables with several sets of seed data.
Each seeded table has a changeset to create the table, followed by a changeset to load the seed data. The seeds are loaded using loadUpdateData, a smart method that loads seed data from a CSV; if the CSV content is edited, it applies the appropriate updates directly.
The seed ChangeSets are in a separate ChangeLog that is always run after the core ChangeLog. This way the seed files can always reflect the correct table structure.
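Roughly, the master changelog wires these together with <include> elements; something like this, with illustrative file names and the usual databaseChangeLog namespace attributes omitted:
<databaseChangeLog>
    <!-- core schema changesets run first -->
    <include file="db/changelog/core-changelog.xml"/>
    <!-- seed changesets always run afterwards -->
    <include file="db/changelog/seed-changelog.xml"/>
</databaseChangeLog>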
The Problem
I need to drop a table that has seed data. The loadUpdateData command errors because the table no longer exists by the time it is run.
The Code
Create Table ChangeSet
<changeSet author="" id="create-table-help-items">
<createTable tableName="help_items">
<column name="help_item_id"
type="bigint">
<constraints primaryKey="true"/>
</column>
<column name="title"
type="text" />
<column name="description"
type="text" />
</createTable>
<rollback>
<dropTable tableName="help_items"/>
</rollback>
</changeSet>
Seed ChangeSet
<changeSet author="" id="seed-help-items" runOnChange="true">
<loadUpdateData file="db/seeds/help_items.csv"
primaryKey="help_items_id"
tableName="help_items" />
</changeSet>
Drop Table ChangeSet
<changeSet author="" id="remove-table-help-items">
<dropTable tableName="help_items"/>
<rollback changeSetId="create-table-help-items" changeSetAuthor=""/>
</changeSet>
The Questions
Given that it is bad practice to ever delete changesets from a changelog:
What is the right way to create seed migrations so that they don't break when the table is deleted?
Do I need to keep the seed files for tables that have been dropped?
I had the same problem and have now solved it; this works for me.
The point is that you should set the rollback in the "Drop Table ChangeSet" correctly.
So when dropping a table, you should already prepare for a possible rollback.
<changeSet author="" id="remove-table-help-items">
<dropTable tableName="help_items"/>
<rollback>
<createTable tableName="help_items"/>
</rollback>
</changeSet>
I solved this issue by setting runAlways: false for the loadUpdateData changeset, so it doesn't fail when the table is dropped.
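In the XML format used above, I understand that to look something like this (note that runOnChange is also no longer set):
<changeSet author="" id="seed-help-items" runAlways="false">
    <!-- with runOnChange removed and runAlways="false" (the default), this changeSet is only ever applied once -->
    <loadUpdateData file="db/seeds/help_items.csv"
                    primaryKey="help_item_id"
                    tableName="help_items" />
</changeSet>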
Related
Is it possible to generate change logs, merging change sets together?
Like this:
<changeSet author="Author" id="1">
<dropNotNullConstraint columnDataType="varchar(255)" columnName="name" tableName="table"/>
<dropNotNullConstraint columnDataType="varchar(255)" columnName="phone" tableName="table"/>
<dropNotNullConstraint columnDataType="bigint" columnName="email" tableName="table"/>
</changeSet>
instead of this:
<changeSet author="Author" id="1">
<dropNotNullConstraint columnDataType="varchar(255)" columnName="name" tableName="table"/>
</changeSet>
<changeSet author="Author" id="2">
<dropNotNullConstraint columnDataType="varchar(255)" columnName="phone" tableName="table"/>
</changeSet>
<changeSet author="Author" id="3">
<dropNotNullConstraint columnDataType="bigint" columnName="email" tableName="table"/>
</changeSet>
If so, can you choose a custom scope? (e.g. all changes in a given table are grouped together, etc.). If not, is there a 'best practice' reason for avoiding grouping them manually?
If you mean the generateChangeLog command, then no, there is no option to put all changes into one changeSet. However, you can do that programmatically if you need to.
Liquibase generates one change per changeSet because each changeSet is executed in one transaction (read here). So if you include a lot of changes inside one changeSet and execution of that changeSet fails, it will be harder for you to find which change it failed on. It's better to handle one change per changeSet.
Note: sometimes I also use multiple changes per changeSet (for example remarks (comments) for a table, or DML (data) modifications), but I prefer to have one DDL change per changeSet.
I am trying to populate 2 tables through CSV files in Liquibase.
I have one table called tenant and another named tenant_configuration, which has a foreign key to tenant. The first part loads the tenant data:
<changeSet id="1" context="test">
<comment>Insert data for tenant table</comment>
<loadUpdateData
primaryKey="id"
file="tenant.csv"
tableName="tenant"/>
</changeSet>
Then I would like to use another CSV file to populate the tenant configuration but retrieve tenant_id from the first change.
<changeSet id="2" context="test">
<comment>Insert data for tenant_db_configuration table</comment>
<loadData tableName="tenant_db_configuration"
file="tenant_db_configuration.csv"
separator="," >
<column name="tenant_id" type="NUMERIC" defaultValueComputed="(SELECT ID FROM tenant WHERE tenant_id = tenant_1)"/>
<column header="username" name="username" type="STRING"/>
<column header="password" name="password" type="STRING"/>
</loadData>
</changeSet>
I tried this but Liquibase ignores the tenant_id part and shows:
[Failed SQL: INSERT INTO [dbo].[tenant_db_configuration] ([username], [password]) VALUES...
How can I retrieve that foreign key and merge it with the existing CSV file to load the data?
Thanks!
As far as I know, you will need to make the relation manually in the second CSV file, tenant_db_configuration.csv. I mean, each row in there will need to point to an existing id in the tenant table.
If the order doesn't matter because you trust your CSV data, you could disable the foreign key checks with a changeSet before starting to import the data.
I don't know which DBMS you are using, but with MySQL it is:
<changeSet author="liquibase-docs" id="sql-example">
<sql dbms="mysql">
SET FOREIGN_KEY_CHECKS=0;
</sql>
</changeSet>
and then you can enable them again in a separate changeSet.
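Something along these lines should turn the checks back on afterwards (the changeSet id here is just illustrative):
<changeSet author="liquibase-docs" id="sql-example-enable">
    <sql dbms="mysql">
        SET FOREIGN_KEY_CHECKS=1;
    </sql>
</changeSet>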
It's been 3 months since the question, so I hope this helps. If you already solved it, what was the solution?
I’m using Liquibase via the Gradle-Liquibase (v 1.1.1) plugin. I have the following changeset …
<changeSet id="create_my_stored_proc" author="davea" dbms="mysql" runAlways="true">
<sqlFile endDelimiter="//" path="src/main/resources/scripts/create_my_stored_proc.sql" stripComments="true"/>
</changeSet>
Is it possible to set something such that checksums are ignored for this changeset only? The underlying procedure is in a state of flux that could be repeatedly updated and rather than create a new changeset each time, I would like the existing one to run upon every Liquibase build.
You can disable the checks per changeSet using the <validCheckSum> tag with known good values.
For example, if the previous changeset had a checksum like 8:b3d6a29ce3a75940858cd093501151d1 and you wanted to tweak that changeSet (but not re-apply it where this step has already succeeded) then you could use something like this:
<changeSet author="me" id="mychangeset">
<validCheckSum>8:b3d6a29ce3a75940858cd093501151d1</validCheckSum>
<sqlFile ... />
</changeSet>
"RunAlways will still throw a checksum error by default, but you can always use runOnChange=true or any to change that."
Have a look at this ticket raised in liquibase: https://liquibase.jira.com/browse/CORE-2506
So, you could do:
<changeSet id="create_my_stored_proc" author="davea" dbms="mysql" runAlways="true">
<validCheckSum>any</validCheckSum>
<sqlFile endDelimiter="//" path="src/main/resources/scripts/create_my_stored_proc.sql" stripComments="true"/>
</changeSet>
You can add the runAlways and/or runOnChange attributes.
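For example, one option is to drop runAlways and use runOnChange, so the changeSet re-runs whenever the SQL file changes instead of failing validation; something like this should work:
<changeSet id="create_my_stored_proc" author="davea" dbms="mysql" runOnChange="true">
    <!-- re-executed whenever the checksum of the SQL file changes -->
    <sqlFile endDelimiter="//" path="src/main/resources/scripts/create_my_stored_proc.sql" stripComments="true"/>
</changeSet>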
Testers are updating data through the app
I am working with an application where the app is changing rapidly, but at the same time, the testers need to build test and certification data. This data is being created by accessing the app directly, rather than writing SQL statements.
So, I have changesets coming in from the developers and data changes being applied through the application by testers at the same time.
I have wired up liquibase to handle running the changesets written by the developers, but I'm having difficulties figuring out the cleanest way to track and preserve data changes by the testers.
Possible workflow
Based on using liquibase through the entire process, I'm thinking I need a workflow like:
Start with the latest clean database
Run liquibase update
Snapshot or tag the database for differencing later
Let the testers hack away through the app.
If the testers approve the changes, upon promotion:
generate a data diff as a changeset
include the data changeset in the master update list
on this VM, record the changeset as already ran
commit to scm
Repeat
Questions
Is there a way to use liquibase to get a true data diff, and not a full data export? It seems that generateChangeLog is the only diff-related command that allows setting the --diff-type="data" flag, but the documentation also makes it seem that it won't diff; it just dumps all of the data.
If yes,
can you provide a sample call? I have the url & referenceURL figured out and stored in a liquibase.properties file, I just need to know which command and flags to pass.
can it be used against a tag instead of having to create a backup of the database? (step 3 in the workflow)
If no,
has anyone seen a good tutorial or howto showing the orchestration between liquibase and dbUnit updates?
How do you handle the situation where the data export no longer fits the schema? For example, FullName split => FirstName and LastName; Liquibase can handle this (see the sketch below), but I would think that I would need to orchestrate running updates between Liquibase and dbUnit, otherwise the dbUnit diff will be invalid?
Any guidance, tips, past experiences or gotchas to watch out for would be greatly appreciated.
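For reference, the kind of FullName split I have in mind would be roughly this changeset (table and column names, and the MySQL-flavoured string functions, are purely illustrative):
<changeSet id="split-fullname" author="me">
    <addColumn tableName="person">
        <column name="FirstName" type="varchar(255)"/>
        <column name="LastName" type="varchar(255)"/>
    </addColumn>
    <!-- copy the data across before dropping the old column -->
    <sql>
        UPDATE person
        SET FirstName = SUBSTRING_INDEX(FullName, ' ', 1),
            LastName  = SUBSTRING_INDEX(FullName, ' ', -1);
    </sql>
    <dropColumn tableName="person" columnName="FullName"/>
</changeSet>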
No, liquibase does not support a data diff, only a full data dump.
I have seen http://ljnelson.github.io/liquiunit/ which may help you with dbunit integration, but using dbunit or any other data load tool to manage your data will run into schema incompatibilities like you suggest.
What I would suggest doing is to have "test data load" changeSets that are added into your changeLog to build up your test data as you go along.
For example:
<changeSet id="1" author="x">
<createTable name="a"../>
</changeSet>
<changeSet id="2" author="x">
<createTable name="b"../>
</changeSet>
<changeSet id="3" author="x" context="test">
<sqlFile path="data-dump.1.sql">
</changeSet>
<changeSet id="4" author="x">
<renameColumn oldColumnName="s" newColumnName="t"../>
</changeSet>
<changeSet id="5" author="x" context="test">
<sqlFile path="data-dump.2.sql">
</changeSet>
You see it creates an initial structure and then loads a round of QA's data into the database with the database structure as it is after changeset 2. Notice the use of contexts so the test data isn't loaded into production.
After the test data come more structure changes and then another round of additional QA data. The new data doesn't re-create data-dump.1.sql but is in addition to it. Since data-dump.1.sql is always run before changeSet 4, it doesn't have to be updated as the schema changes.
The big problem, though, is how to extract your test data as QA is building it up. If they are adding it through your application, the easiest approach may be to use something like p6spy to automatically collect all the SQL executed in your application and then just copy it into your data-dump.X.sql files.
As an alternative to my other answer, you can take a full dump of your database (use liquibase generateChangeLog diffTypes=data or your standard database backup tool), which you create from the QA-built database and move along your changelog file.
Step 1:
Create your database
<changeSet id="1" author="x">
<createTable name="a"../>
</changeSet>
<changeSet id="2" author="x">
<createTable name="b"../>
</changeSet>
Step 2
QA creates database then makes a backup which is included in the changelog file.
<changeSet id="1" author="x">
<createTable name="a"../>
</changeSet>
<changeSet id="2" author="x">
<createTable name="b"../>
</changeSet>
<changeSet id="testdata-1" author="x" context="test">
<sqlFile path="data-dump.sql">
</changeSet>
Step 3
When there are schema changes that don't require new test data, keep adding to the changelog file. The test data is migrated along and matches your needed schema.
<changeSet id="1" author="x">
<createTable name="a"../>
</changeSet>
<changeSet id="2" author="x">
<createTable name="b"../>
</changeSet>
<changeSet id="testdata-1" author="x" context="test">
<sqlFile path="data-dump.sql">
</changeSet>
<changeSet id="4" author="x">
<renameColumn oldColumnName="s" newColumnName="t"../>
</changeSet>
Step 4
When QA needs additional data, they add what they want using the app, then make a new full backup and move the testdata changeset to the end:
<changeSet id="1" author="x">
<createTable name="a"../>
</changeSet>
<changeSet id="2" author="x">
<createTable name="b"../>
</changeSet>
<changeSet id="4" author="x">
<renameColumn oldColumnName="s" newColumnName="t"../>
</changeSet>
<changeSet id="testdata-1" author="x" context="test">
<sqlFile path="data-dump.sql">
</changeSet>
This process does require you to drop your dev/test databases to get the new test data or you will get insert conflicts, but depending on your workflow it may work for you.
I am looking to compress a column through Liquibase and I haven't been able to find any examples of this on the Liquibase site.
I was wondering if anyone has an example of this?
You can add custom SQL statements to Liquibase change logs using the <sql> element and use the dbms attribute on change sets to define for which databases they are meant to be run.
<changeSet id=".." dbms="oracle">
<sql>
alter table foobar move compress;
</sql>
<rollback>
<sql>
alter table foobar nocompress;
</sql>
</rollback>
</changeSet>
You can use modifyDataType
<changeSet author="liquibase-docs" id="modifyDataType-example">
<modifyDataType catalogName="cat"
columnName="id"
newDataType="A String"
schemaName="public"
tableName="person"/>
</changeSet>