Can Liquibase detect if it has already run?

I have a small set of scripts that manage the build/test/deployment of an app. Recently I decided I wanted to switch to Liquibase for DB schema management. This script will be working both on developer machines, where it regularly blows away and rebuilds their databases, and in deployed environments, where we will only be adding new changesets.
When this script first runs in a deployed environment, I need to detect whether Liquibase has run or not, and if it has not, run changelogSync to sync with the existing tables.
Other than manually checking whether the database changelog table exists, is there a way for the Liquibase API to tell me that it has already run at least once?
I'm using the Liquibase Java core library from Groovy.

The easiest way is probably ((StandardChangeLogHistoryService) ChangeLogHistoryServiceFactory.getInstance().getChangeLogService(database)).hasDatabaseChangeLogTable()
The ChangeLogHistoryService interface returned by liquibase.changelog.ChangeLogHistoryServiceFactory doesn't have a method to check if the table exists, but the StandardChangeLogHistoryService implementation does.
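For example, a minimal sketch in Java of how this check could be combined with changelogSync, assuming a Liquibase 3.x-style API (ChangeLogHistoryServiceFactory.getInstance() and these method signatures vary across versions, and the changelog path is a placeholder):

import liquibase.Liquibase;
import liquibase.changelog.ChangeLogHistoryServiceFactory;
import liquibase.changelog.StandardChangeLogHistoryService;
import liquibase.database.Database;
import liquibase.resource.ClassLoaderResourceAccessor;

public class LiquibaseBootstrap {
    public static void run(Database database) throws Exception {
        // True if the DATABASECHANGELOG table already exists, i.e. Liquibase
        // has run against this database at least once.
        boolean liquibaseHasRun = ((StandardChangeLogHistoryService)
                ChangeLogHistoryServiceFactory.getInstance().getChangeLogService(database))
                .hasDatabaseChangeLogTable();

        Liquibase liquibase = new Liquibase(
                "db/changelog.xml", new ClassLoaderResourceAccessor(), database);

        if (!liquibaseHasRun) {
            // Existing deployed schema that predates Liquibase: mark all
            // changesets as already applied instead of executing them.
            liquibase.changeLogSync("");
        }

        // Apply any new changesets.
        liquibase.update("");
    }
}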

Related

How to automate source control with Oracle database

I work in an Oracle instance that has hundreds of schemas and multiple developers. We have a development instance where developers can integrate their work before test or production.
We want to have source control for all the DDL run in this integrated development database. Currently this is done with a Red Gate product that we run manually after we make a change to the database. Red Gate finds the differences between what is in the schema and what was last checked into source control, generates a script of the differences, and puts it into source control.
The problem, of course, is that running Red Gate can take some time, so people run it infrequently or not at all for small changes. Also, Red Gate only looks at one schema at a time, and it would be VERY time consuming to run it manually against all schemas to guarantee that they are up to date. However, if the source-controlled code cannot be relied upon, it becomes much less useful...
What would seem ideal would be some software that could update source control (preferably GitHub, as this is used by other teams) from all the schemas, either periodically (even once a day) or when triggered by DDL being run.
I cannot find any existing software that can simply be used to do this.
Is there a problem with doing this? (There is no need to address multiple developers overwriting each other's work on the same day, as we have this covered in a separate process.) Is anyone doing this? Can anyone recommend a way to do this?
We do this with the help of a PL/SQL function, a Python script and a shell script:
The PL/SQL function generates the DDL of a whole schema and returns it as a CLOB.
The Python script connects to the database, fetches the DDL and stores it in files.
The shell script runs the source control commands to add the modifications (we use Bazaar here).
You can see the scripts on PasteBin:
The PL/SQL function is here: http://pastebin.com/AG2Fa9zL
The python program (schema_exporter.py): http://pastebin.com/nd8Lf0gK
The shell script:
#!/bin/sh
# Export the schema DDL to files, then commit any changes to Bazaar.
python schema_exporter.py
d=$(date +%Y-%m-%d__%H_%M_%S)
bzr add
bzr st | grep -q -E 'added|modified' && bzr commit -m "Database objects on $d"
exit 0
This shell script is configured to run from cron every day.
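For illustration, a rough Java/JDBC sketch of the export step (the PasteBin scripts above are the real implementation; this version simply pulls table DDL with Oracle's DBMS_METADATA and writes one file per table, and the connection settings are placeholders):

import java.io.FileWriter;
import java.sql.Clob;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SchemaExporter {
    public static void main(String[] args) throws Exception {
        // Placeholder connection settings.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@//db-host:1521/ORCL", "scott", "tiger");
             Statement stmt = conn.createStatement();
             // DBMS_METADATA.GET_DDL returns the CREATE statement for each table as a CLOB.
             ResultSet rs = stmt.executeQuery(
                     "SELECT table_name, DBMS_METADATA.GET_DDL('TABLE', table_name) FROM user_tables")) {
            while (rs.next()) {
                String name = rs.getString(1);
                Clob ddl = rs.getClob(2);
                try (FileWriter out = new FileWriter(name + ".sql")) {
                    out.write(ddl.getSubString(1, (int) ddl.length()));
                }
            }
        }
    }
}

The generated files can then be added and committed by the shell script exactly as shown above.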
Having been in the database version control space for 5 years (as director of product management at DBmaestro) and having worked as a DBA for over two decades, I can tell you that you simply cannot treat database objects the way you treat your Java, C# or other files, saving the changes as plain DDL scripts.
There are many reasons; I'll name a few:
Files are stored locally on the developer's PC, and the changes he or she makes do not affect other developers; likewise, the developer is not affected by changes made by colleagues. In a database this is (usually) not the case: developers share the same database environment, so any change committed to the database affects the others.
Publishing code changes is done via Check-In / Submit Changes / etc. (depending on which source control tool you use). At that point the code from the developer's local directory is inserted into the source control repository, and a developer who wants the latest code has to request it from the source control tool. In a database, the change already exists and impacts other users even if it was never checked in to the repository.
During a file check-in, the source control tool performs a conflict check to see if the same file was modified and checked in by another developer while you were modifying your local copy. Again, there is no such check in the database: if you alter a procedure from your PC and at the same time I modify the same procedure from my PC, we override each other's changes.
The build process for code is done by getting the label / latest version of the code into an empty directory and then performing a build (compile). The output is binaries, which we copy over to replace the existing ones; we don't care what was there before. With a database we cannot recreate it from scratch, because we need to keep the data! Instead, the deployment executes SQL scripts that were generated in the build process.
When executing the SQL scripts (with the DDL, DCL and DML (for static content) commands), you assume the current structure of the environment matches the structure at the time you created the scripts. If not, your scripts can fail, for example by trying to add a new column that already exists.
Treating SQL scripts as code and generating them manually causes syntax errors, database dependency errors and scripts that are not reusable, which complicates developing, maintaining and testing those scripts. In addition, those scripts may run on an environment that is different from the one you thought they would run on.
Sometimes the script in the version control repository does not match the structure of the object that was tested, and then errors happen in production!
There are many more, but I think you get the picture.
What I have found works is the following:
Use an enforced version control system that requires check-out/check-in operations on the database objects. This makes sure the version control repository matches the code that was checked in, because it reads the object's metadata as part of the check-in operation rather than in a separate manual step. It also allows several developers to work in parallel on the same database while preventing them from accidentally overriding each other's code.
Use impact analysis that utilizes baselines as part of the comparison to identify conflicts and to determine whether a difference (when comparing an object's structure between the source control repository and the database) is a real change originating from development, or a difference that originated from a different path (such as a different branch or an emergency fix) and should therefore be skipped.
Use a solution that can perform impact analysis for many schemas at once, via a UI or an API, so that you can eventually automate the build & deploy process.
An article I wrote on this was published here; you are welcome to read it.
To me it seems like your way of working is backwards: developers run DDL against the DB in an unordered fashion, and then you need an automated tool to infer the changes (and the DDL) that were run.
The process would be in better control if you did the following instead:
Developers write DDL as SQL scripts, preferably using a migration tool such as Flyway (http://flywaydb.org/documentation/migration/sql.html).
Migration scripts are checked into version control
Migration scripts are periodically run against the DB (e.g. by the migration tool)
In this workflow, the DB would only get altered through automated migration scripts and no-one is allowed to do changes manually. Could this work for you?
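As a rough sketch of the automated step, assuming Flyway's Java API (Flyway 5+ fluent style; connection settings and the script location are placeholders):

import org.flywaydb.core.Flyway;

public class MigrateDb {
    public static void main(String[] args) {
        Flyway flyway = Flyway.configure()
                // Placeholder connection settings; real values would come from configuration.
                .dataSource("jdbc:oracle:thin:@//db-host:1521/ORCL", "app_user", "secret")
                // Directory holding the versioned SQL migration scripts from version control.
                .locations("filesystem:sql/migrations")
                .load();

        // Applies any pending versioned migrations, in order, and records them.
        flyway.migrate();
    }
}

The same call can run from a scheduled job or from the build, so the database only ever changes through checked-in scripts.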
(I develop the Oracle tools for Redgate)
Actually, using the tools you can already do what I think you're asking for, using Schema Compare for Oracle.
You can compare multiple schemas either in the UI or via the command line - I think what you're after is automating the command line tool which can create difference scripts, sync between source and destination (live, snapshot or scripts) and generate reports.
You can automate the command line to sync to a scripts folder which is your source code checkout and then subsequently run a command to commit the changes.
I think that's all good :)
We built a commercial tool that bridges Oracle with Git. It helps you manage your database objects with Git. Basically, the database becomes the working directory for the developer. You can perform Git operations in the database such as reset, commit, branch, merge, etc., and the database code is updated automatically. It might be worth taking a look: https://www.gitora.com

Using Liquibase and cherry-picking changesets

So I want to use Liquibase as a replacement for SQL scripts for preparing databases in the different environments (SIT -> UAT -> PROD). The plan is to execute the Liquibase update (with some other parameters in place if necessary) before testing starts.
The caveat is that all files (including the Liquibase XML) submitted to UAT and PROD must be frozen; i.e. there can be no change to any file that has successfully passed SIT. Is there any way I can do this, so that in UAT I only execute changesets that have successfully passed SIT (and similarly, in PROD I only execute changesets that have successfully passed UAT), without actually altering the Liquibase XML file?
Thanks.
UPDATE
There are several issues which are inherent inside the current development cycle:
It would be redundant to ask the developers to run SIT again, this time with context=SIT added to the changesets.
Developers only want to test their own changesets in UAT. A developer is only responsible for his own changesets, meaning they don't want to run others' changesets, even if those changesets have successfully passed SIT. The same issue also applies for UAT -> PROD.
Sorry I was not clear on this issue beforehand. I was tasked with implementing Liquibase at my current workplace, and I don't have a really good picture of what's really happening in the cycle.
Liquibase does not allow you to pick certain changeSets to execute. The main reason is that the order in which changes run against a database can make a big difference. It normally doesn't help to have developers run just their own changeSets, because the database changes created by others are still needed by the application.
I think the most common way to handle your scenario is to rely on the same version control practices you use for your codebase. Liquibase is designed as a simple text format so that the changelog files can be stored in version control along with your code. Then, you can have branches for UAT and PROD and you can control what is going into those branches, including what changeSets are in the changelogs.
I think the best option would be to use contexts (http://www.liquibase.org/documentation/contexts.html). ChangeSets that have passed SIT can be marked with context="sit". Then, when you update UAT and PROD, run with context=sit and only the tagged changesets will execute.
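A minimal sketch of how that looks from the Java API, assuming a Liquibase 3.x-style API (the changelog path is a placeholder; the changesets themselves carry context="sit" in the XML):

import liquibase.Contexts;
import liquibase.LabelExpression;
import liquibase.Liquibase;
import liquibase.database.Database;
import liquibase.resource.ClassLoaderResourceAccessor;

public class PromoteWithContexts {
    public static void run(Database database) throws Exception {
        Liquibase liquibase = new Liquibase(
                "db/changelog.xml", new ClassLoaderResourceAccessor(), database);

        // Only changesets marked context="sit" (or with no context at all)
        // are executed when the "sit" context is passed.
        liquibase.update(new Contexts("sit"), new LabelExpression());
    }
}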
I think that, based on your valid requirement that all scripts be frozen in file-based version control external to Liquibase, you have a major challenge here.
Liquibase cannot guarantee that files are frozen - it is not for Liquibase to know that.
You are welcome to review DBmaestro Teamwork, which gives you enforced version control of your database objects (guaranteeing that the repository and the workspace database are in sync). Its delta-script generation also handles merges (between different environments, UAT critical fixes, branches) of changes that did not originate from the development environment.
Disclaimer - I work at DBmaestro.

Change Activiti Diagram

I created a simple model and started many processes; they are waiting for approval. If I update my diagram while they are waiting, what happens to these processes? And how can I update the diagram? I tried editing the model and saving it, but it didn't change.
Every process definition has a version. All running process instances stay on the definition version they were started with. You can migrate running instances to a new version of the definition with org.activiti.engine.impl.cmd.SetProcessDefinitionVersionCmd.
http://forums.activiti.org/content/migrating-process-instances-newer-versions
But be careful
This command will NOT perform any migration magic and simply set the process definition version in the database, assuming that the user knows, what he or she is doing.
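For illustration, a rough sketch of issuing that command through the ManagementService, assuming Activiti 5.x (the instance id and target version are placeholders, and the command migrates one process instance at a time, so you would loop over instances):

import org.activiti.engine.ProcessEngine;
import org.activiti.engine.ProcessEngines;
import org.activiti.engine.impl.cmd.SetProcessDefinitionVersionCmd;

public class MigrateInstance {
    public static void main(String[] args) {
        ProcessEngine processEngine = ProcessEngines.getDefaultProcessEngine();

        // Placeholder values: a running process instance and the definition
        // version it should be moved to.
        String processInstanceId = "12345";
        int newVersion = 2;

        // Sets the process definition version for that instance in the database;
        // as the warning above says, it performs no migration magic.
        processEngine.getManagementService()
                .executeCommand(new SetProcessDefinitionVersionCmd(processInstanceId, newVersion));
    }
}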

Can you control Liquibase updateSQL output by major release?

We use liquibase to generate database changes but need these scripted into SQL files because our DB server has neither Liquibase nor even a JVM installed.
We use the updateSQL command to create the DDL script needed to make the changes; however, if we first run a 'dropAll' (on our development server) we get a changeset for every release.
Is there a way to run a regular Liquibase 'update' for one collection of changesets (i.e. all prior releases) and then produce updateSQL output only for the latest release? Essentially, can we parameterize our build process to indicate the release we are targeting and automatically produce only the SQL for that release?
Thanks
We have a similar setup.
The way we handle this is that we have an "integration" DB that is always kept at the last official release.
When we have a new release candidate, we let Liquibase run (updateSQL) against that integration DB. Since it is on the last (i.e. the currently released) version, updateSQL will only write out the difference between the new release candidate and the last release.
So you have a delta DDL that needs to be applied to go from release x to y.
Once the release candidate is released, we also let Liquibase update the integration DB.
There are several ways to run a portion of your changelog.
If you break up your changelog files by version (changelog-1.0.xml, changelog-1.1.xml, changelog-2.0.xml) and then have a master changelog.xml that <include>'s them, the easiest solution may be to run updateSql passing in the changelog version you want. If you want to generate SQL for the entire database, run "liquibase --changeLogFile=master.changelog.xml updateSql". If you want to generate SQL for just version 2.1, run "liquibase --changeLogFile=changelog-2.1.xml updateSql".
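If you drive Liquibase from code rather than the command line, a rough sketch of the same idea, assuming a Liquibase 3.x-style API where the update(contexts, Writer) overload writes the SQL out instead of executing it (paths are illustrative):

import java.io.FileWriter;
import java.io.Writer;

import liquibase.Liquibase;
import liquibase.database.Database;
import liquibase.resource.FileSystemResourceAccessor;

public class GenerateReleaseSql {
    public static void run(Database database) throws Exception {
        // Point at the changelog for the release you want SQL for;
        // pointing at the master changelog would produce everything not yet applied.
        Liquibase liquibase = new Liquibase(
                "changelog-2.0.xml", new FileSystemResourceAccessor(), database);

        // Write out the SQL that would be run, instead of executing it.
        try (Writer out = new FileWriter("release-2.0.sql")) {
            liquibase.update("", out);
        }
    }
}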
Jens' answer works well and is what I often suggest. Compared to the above option, it has the advantage of catching new changeSets that were introduced into old changelog versions through a merged-in patch release, for example.
Beyond that, you could use contexts or preconditions to dynamically control what is run. Depending on your setup there may be ways to get those to work well for you.
Finally, there is always the extension system where you can write custom logic around how changelogs are parsed and executed to pull out older version changeSets based on file name or some other mechanism that works for you.

How can I include database changes (DDL patches, one-time data inserts, etc) in my build process?

I run my build script and then I have to remember which of the database SQL and PL/SQL scripts to run each time I deploy my application. How can I include these patches in my build script? Or does everybody just run them manually? Currently I number my patches so I know the order to run them, but sometimes I have to check SVN history to know what number to start at.
I'm using PHP but can use Java in my solution to this problem.
Liquibase might solve this problem for you; it integrates with Ant or Maven, but can also be started from the command line.
You should be saving your changes as scripts and putting them in source control like the rest of your code. Then you know what changes belong to what build and need to be promoted to prod.
Since you're using PHP, Phing's DbDeployTask would be a smart choice. For every DB table you will have a start file and a number of patches, e.g.:
001 user.sql
002 project.sql
501 user-AddColumnAvatar.sql
etc.