Load data from CSV file with foreign key in Liquibase

I am trying to populate two tables from CSV files in Liquibase.
I have one table called tenant and another named tenant_configuration, which has a foreign key to tenant. The first part loads the tenant data:
<changeSet id="1" context="test">
<comment>Insert data for tenant table</comment>
<loadUpdateData
primaryKey="id"
file="tenant.csv"
tableName="tenant"/>
</changeSet>
Then I would like to use another CSV file to populate the tenant configuration, retrieving tenant_id from the first change.
<changeSet id="2" context="test">
<comment>Insert data for tenant_db_configuration table</comment>
<loadData tableName="tenant_db_configuration"
file="tenant_db_configuration.csv"
separator="," >
<column name="tenant_id" type="NUMERIC" defaultValueComputed="(SELECT ID FROM tenant WHERE tenant_id = tenant_1)"/>
<column header="username" name="username" type="STRING"/>
<column header="password" name="password" type="STRING"/>
</loadData>
</changeSet>
I tried this, but Liquibase ignores the tenant_id part and shows:
[Failed SQL: INSERT INTO [dbo].[tenant_db_configuration] ([username], [password]) VALUES...
How can I retrieve that foreign key and merge it with the existing CSV file to load the data?
thanks!

As far as I know, you will need to set up the relation manually in the second CSV file, tenant_db_configuration.csv. That is, each row there needs to point to an existing id in the tenant table.
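A minimal sketch of that approach, assuming tenant_db_configuration.csv gains a tenant_id column whose values match ids already loaded from tenant.csv. The sample rows in the comment and the author value are hypothetical; the column types are taken from the question.

```xml
<!-- Hypothetical contents of tenant_db_configuration.csv:
     tenant_id,username,password
     1,user_a,secret_a
     2,user_b,secret_b
     Each tenant_id must already exist as an id in the tenant table. -->
<changeSet id="2" author="example" context="test">
    <comment>Insert data for tenant_db_configuration table</comment>
    <loadData tableName="tenant_db_configuration"
              file="tenant_db_configuration.csv"
              separator=",">
        <!-- The foreign key is read from the CSV like any other column -->
        <column header="tenant_id" name="tenant_id" type="NUMERIC"/>
        <column header="username" name="username" type="STRING"/>
        <column header="password" name="password" type="STRING"/>
    </loadData>
</changeSet>
```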
If the order doesn't matter because you trust your CSV data, you could disable the foreign key checks with a changeSet before starting to import the data.
I don't know which DBMS you are using, but with MySQL it is:
<changeSet author="liquibase-docs" id="sql-example">
<sql dbms="mysql">
SET FOREIGN_KEY_CHECKS=0;
</sql>
</changeSet>
and then you can re-enable it afterwards in a separate changeSet.
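The re-enabling changeSet could look roughly like this. The id is a hypothetical name chosen for illustration. Note that in MySQL this setting applies per session, so the sketch assumes the changeSets run on the same connection.

```xml
<changeSet author="liquibase-docs" id="sql-example-enable-fk-checks">
    <sql dbms="mysql">
        <!-- Turn foreign key checks back on once the CSV data is loaded -->
        SET FOREIGN_KEY_CHECKS=1;
    </sql>
</changeSet>
```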
It's been 3 months since the question, so I hope it helps. If you solved it already, what was the solution?

Related

Dropping a table with a loadUpdateData changeset in Liquibase

The Setup
I am using Liquibase to manage my project's migrations.
I have several tables with several sets of seed data.
Each seeded table has a changeset to create the table, followed by a changeset to load the seed data. The seeds are loaded using loadUpdateData, which loads seed data from a CSV and, if the CSV content is later edited, applies the corresponding updates directly.
The seed ChangeSets are in a separate ChangeLog that is always run after the core ChangeLog. This way the seed files can always reflect the correct table structure.
The Problem
I need to drop a table that has seed data. The loadUpdateData command errors because the table no longer exists by the time it is run.
The Code
Create Table ChangeSet
<changeSet author="" id="create-table-help-items">
<createTable tableName="help_items">
<column name="help_item_id"
type="bigint">
<constraints primaryKey="true"/>
</column>
<column name="title"
type="text" />
<column name="description"
type="text" />
</createTable>
<rollback>
<dropTable tableName="help_items"/>
</rollback>
</changeSet>
Seed ChangeSet
<changeSet author="" id="seed-help-items" runOnChange="true">
<loadUpdateData file="db/seeds/help_items.csv"
primaryKey="help_items_id"
tableName="help_items" />
</changeSet>
Drop Table ChangeSet
<changeSet author="" id="remove-table-help-items">
<dropTable tableName="help_items"/>
<rollback changeSetId="create-table-help-items" changeSetAuthor=""/>
</changeSet>
The Questions
Given that it is bad practice to ever delete changesets from a changelog, what is the right way to create seed migrations so that they don't break when the table is deleted?
Do I need to keep the seed files for tables that have been dropped?
I had the same problem and have now solved it; it works for me.
The point is that you should set the rollback in the "Drop Table ChangeSet" correctly.
When dropping a table, you should already prepare for the rollback case.
<changeSet author="" id="remove-table-help-items">
<dropTable tableName="help_items"/>
<rollback>
<createTable tableName="help_items"/>
</rollback>
</changeSet>
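A fuller sketch of such a rollback, assuming the table should be restored with the same columns as in the question's create-table-help-items changeset. The column definitions are copied from that changeset, not from this answer.

```xml
<changeSet author="" id="remove-table-help-items">
    <dropTable tableName="help_items"/>
    <rollback>
        <!-- Recreate the table with the columns from create-table-help-items -->
        <createTable tableName="help_items">
            <column name="help_item_id" type="bigint">
                <constraints primaryKey="true"/>
            </column>
            <column name="title" type="text"/>
            <column name="description" type="text"/>
        </createTable>
    </rollback>
</changeSet>
```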
I solved this issue by setting runAlways: false for the loadUpdateData changeset, so it doesn't fail when the table is dropped.

Adding 'CONSTRAINT ensure_json CHECK (po_document IS JSON))' in liquibase

I want to add a check constraint to a BLOB column that stores JSON data, in a CREATE TABLE script in Liquibase (version 3.3.5, database: Oracle 12c), but it does not compile. Can anyone please explain the right syntax for a constraint that ensures only JSON data can be inserted? I followed this question
Plain SQL: CONSTRAINT ensure_json CHECK (po_document IS JSON))
But I am not sure what the Liquibase equivalent is.
PostgreSQL Check Constraint in Liquibase
<changeSet id="Change_id" author="xqz">
<createTable tableName="table_name">
<column name="pkey" type="int">
<constraints primaryKey="true"/>
</column>
<column name="table2_pkey" type="int">
<constraints nullable="false"/>
</column>
<column name="name" type="varchar(100)">
<constraints nullable="false"/>
</column>
<column name="filters" type="BLOB">
<constraints checkConstraint="ensure_json CHECK (filters IS JSON)" />
</column>
</createTable>
</changeSet>
If I add the constraint to the filters column, the build fails; if I remove it, the build succeeds. What am I doing wrong? I could not find the syntax for it in the Liquibase docs.
You cannot define check constraints in Liquibase; for additional information see this forum entry.
You'll have to use an <sql> tag like
<sql dbms="oracle">
CREATE TABLE table_name (
pkey integer PRIMARY KEY,
table2_pkey integer NOT NULL,
name varchar(100) NOT NULL,
filters blob CONSTRAINT ensure_json CHECK (filters IS JSON)
)
</sql>
It should work just the same, except that you have to add your own <rollback> tag.
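As a sketch, the complete changeSet with a hand-written rollback might look like this. It reuses the id and author from the question and assumes that dropping the table is the appropriate inverse of the CREATE TABLE statement.

```xml
<changeSet id="Change_id" author="xqz">
    <sql dbms="oracle">
        CREATE TABLE table_name (
            pkey integer PRIMARY KEY,
            table2_pkey integer NOT NULL,
            name varchar(100) NOT NULL,
            filters blob CONSTRAINT ensure_json CHECK (filters IS JSON)
        )
    </sql>
    <!-- Raw SQL has no automatic rollback, so one is declared explicitly -->
    <rollback>
        <dropTable tableName="table_name"/>
    </rollback>
</changeSet>
```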

How to insert using a sequence with Liquibase

I would like to do a liquibase insert with the primary key being auto generated from the sequence defined in the database. The target database is HSQLDB.
It works to do an insert specifying a value for the primary key
<insert ...>
<column name="TAG_ID" valueNumeric="2"/>
I found this (admittedly older) conversation about it but the issue is still the same. The suggested fix doesn't work for HSQLDB.
Looking at the docs I've tried some things like
<column name="TAG_ID" defaultValueSequenceNext="TAG_ID_SEQ" />
<column name="TAG_ID" defaultValueSequenceNext="TAG_ID_SEQ.NEXTVAL" />
<column name="TAG_ID" valueComputed="TAG_ID_SEQ.NEXTVAL" />
<column name="TAG_ID" autoIncrement="true" />
but none of those put anything in the key when I do the insert (the insert fails on a null primary key).
How does one accomplish this?
HSQLDB has a setting to use Oracle syntax. You can set HSQLDB to use oracle syntax like so:
<changeSet ...
<sql dbms="hsqldb" >SET DATABASE SQL SYNTAX ORA TRUE</sql>
</changeSet>
After that, it works to do the insert like this:
<insert ...
<column name="TAG_ID" valueComputed="TAG_ID_SEQ.NEXTVAL"/>
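Put together, the two steps might look like the following fragment. The changeSet ids, the author, the TAG table name and the NAME column are hypothetical placeholders, since the original snippets elide them; only TAG_ID, TAG_ID_SEQ and the SET statement come from the posts above.

```xml
<changeSet id="enable-oracle-syntax" author="example">
    <!-- Switch HSQLDB to Oracle syntax so that SEQ.NEXTVAL is accepted -->
    <sql dbms="hsqldb">SET DATABASE SQL SYNTAX ORA TRUE</sql>
</changeSet>

<changeSet id="insert-tag" author="example">
    <insert tableName="TAG">
        <!-- valueComputed is passed to the database as an expression, not a literal -->
        <column name="TAG_ID" valueComputed="TAG_ID_SEQ.NEXTVAL"/>
        <column name="NAME" value="example"/>
    </insert>
</changeSet>
```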

Create a data migration strategy for testers working through the app

Testers are updating data through the app
I am working with an application where the app is changing rapidly, but at the same time, the testers need to build test and certification data. This data is being created by accessing the app directly, rather than writing SQL statements.
So, I have changesets coming in from the developers and data changes being applied through the application by testers at the same time.
I have wired up liquibase to handle running the changesets written by the developers, but I'm having difficulties figuring out the cleanest way to track and preserve data changes by the testers.
Possible workflow
Based on using liquibase through the entire process, I'm thinking I need a workflow like:
1. Start with the latest clean database
2. Run liquibase update
3. Snapshot or tag the database for differencing later
4. Let the testers hack away through the app.
5. If the testers approve the changes, upon promotion:
   - generate a data diff as a changeset
   - include the data changeset in the master update list
   - on this VM, record the changeset as already run
   - commit to scm
6. Repeat
Questions
Is there a way to use Liquibase to get a true data diff, and not a full data export? It seems that generateChangeLog is the only one of the diff tools that allows setting the --diff-type="data" flag, but the documentation also suggests that it won't diff; it just dumps all of the data.
If yes,
can you provide a sample call? I have the url & referenceURL figured out and stored in a liquibase.properties file, I just need to know which command and flags to pass.
can it be used against a tag instead of having to create a backup of the database? (step 3 in the workflow)
If no,
has anyone seen a good tutorial or howto showing the orchestration between liquibase and dbUnit updates?
How do you handle the situation where the data export no longer fits the schema? For example, FullName split => FirstName and LastName; liquibase can handle this, but I would think that I would need to orchestrate running updates between liquibase and dbUnit, otherwise the diff of dbUnit will be invalid?
Any guidance, tips, past experiences or gotchas to watch out for would be greatly appreciated.
No, liquibase does not support a data diff, only a full data dump.
I have seen http://ljnelson.github.io/liquiunit/ which may help you with dbunit integration, but using dbunit or any other data load tool to manage your data will run into schema incompatibilities like you suggest.
What I would suggest doing is to have "test data load" changeSets that are added into your changeLog to build up your test data as you go along.
For example:
<changeSet id="1" author="x">
<createTable name="a"../>
</changeSet>
<changeSet id="2" author="x">
<createTable name="b"../>
</changeSet>
<changeSet id="3" author="x" context="test">
<sqlFile path="data-dump.1.sql"/>
</changeSet>
<changeSet id="4" author="x">
<renameColumn oldColumnName="s" newColumnName="t"../>
</changeSet>
<changeSet id="5" author="x" context="test">
<sqlFile path="data-dump.2.sql"/>
</changeSet>
You see it creates an initial structure and then loads a round of QA's data into the database with the database structure as it is after changeSet 2. Notice the use of contexts so the test data isn't loaded into production.
After the test data come more structure changes and then another round of additional QA data. The new data doesn't re-create data-dump.1.sql but is in addition to it. Since data-dump.1.sql is always run before changeSet 4, it doesn't have to be updated as the schema changes.
The big problem, though, is how to extract your test data as QA is building it up. If they are adding it through your application, the easiest approach may be to use something like p6spy to automatically collect all the SQL executed in your application and then just copy it into your data-dump.X.sql files.
As an alternative to my other answer, you can have a full dump of your database (use liquibase generateChangeLog diffTypes=data or your standard database backup tool) which you create from the QA-built database and move along your changelog file.
Step 1:
Create your database
<changeSet id="1" author="x">
<createTable name="a"../>
</changeSet>
<changeSet id="2" author="x">
<createTable name="b"../>
</changeSet>
Step 2
QA creates database then makes a backup which is included in the changelog file.
<changeSet id="1" author="x">
<createTable name="a"../>
</changeSet>
<changeSet id="2" author="x">
<createTable name="b"../>
</changeSet>
<changeSet id="testdata-1" author="x" context="test">
<sqlFile path="data-dump.sql"/>
</changeSet>
Step 3
When there are schema changes that don't require new test data, keep adding them to the changelog file. The test data is migrated along and matches your needed schema.
<changeSet id="1" author="x">
<createTable name="a"../>
</changeSet>
<changeSet id="2" author="x">
<createTable name="b"../>
</changeSet>
<changeSet id="testdata-1" author="x" context="test">
<sqlFile path="data-dump.sql"/>
</changeSet>
<changeSet id="4" author="x">
<renameColumn oldColumnName="s" newColumnName="t"../>
</changeSet>
Step 4
When QA needs additional data, they add what they want using the app, then make a new full backup and move the testdata changeset to the end.
<changeSet id="1" author="x">
<createTable name="a"../>
</changeSet>
<changeSet id="2" author="x">
<createTable name="b"../>
</changeSet>
<changeSet id="4" author="x">
<renameColumn oldColumnName="s" newColumnName="t"../>
</changeSet>
<changeSet id="testdata-1" author="x" context="test">
<sqlFile path="data-dump.sql"/>
</changeSet>
This process does require you to drop your dev/test databases to get the new test data or you will get insert conflicts, but depending on your workflow it may work for you.

When creating a column using liquibase, how do I specify a value for that column based on an existing column?

I have an existing mysql table with two columns a and b.
I now want to add a column c to that table.
c should be nullable, should have a default value of NULL, except in those rows where column b has the value 10. Where b has the value 10, c should have a value X.
I understand that it is fairly simple to do this using SQL, but I want to do this using liquibase, since liquibase is what we use for our schema migrations.
Have you already tried something like this?
<addColumn tableName="SGW_PRODOTTI_INFO_ATTRIBUTE">
<column name="AlternativeListPrice" type="double" defaultValue="0.0">
<constraints nullable="true"/>
</column>
</addColumn>
I think the best solution without using plain SQL is the following:
Use the addColumn change, as Walter wrote;
Use the update change.
You can use both changes within one changeset, but it is good practice to put each in its own changeset for Liquibase transaction/rollback purposes.
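A rough sketch of those two changesets for the question's columns b and c. The table name my_table, the varchar(50) type, the ids and the author are assumptions made for illustration; X stands for whatever value the question intends.

```xml
<changeSet id="add-column-c" author="example">
    <addColumn tableName="my_table">
        <!-- Nullable column with no explicit default, so existing rows get NULL -->
        <column name="c" type="varchar(50)">
            <constraints nullable="true"/>
        </column>
    </addColumn>
</changeSet>

<changeSet id="populate-column-c" author="example">
    <update tableName="my_table">
        <column name="c" value="X"/>
        <!-- Only rows where b is 10 receive the value -->
        <where>b = 10</where>
    </update>
</changeSet>
```

Keeping the update in its own changeset means a failure while populating c does not affect the changeset that added the column.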
If you are adding the column, then:
<changeSet author="your-name" id="your-id">
<addColumn tableName="person" >
<column name="is_active" type="varchar2(1)" defaultValue="Y" />
</addColumn>
</changeSet>
add-column
If the column has already been added and you need to set a default value:
<changeSet author="your-name" id="your-id">
<addDefaultValue columnDataType="varchar2(1)" columnName="is_active" defaultValue="Y" tableName="person"/>
</changeSet>
add-default-value