Liquibase changelog without MD5SUM - liquibase

I have a database here which is updated using Liquibase. If I understand correctly, Liquibase will apply a changeset and write a line to the DATABASECHANGELOG table with the execution date and an MD5 checksum. This way, Liquibase can find out when a changeset (unexpectedly) changes.
However, in this database, many (most) of the MD5SUM entries are NULL. I have no idea why that should be the case. Is this in any way a normal mode of operation?
When using the Liquibase status command, I see many ‘unapplied’ changes. How can Liquibase determine that without the MD5 sum? Or are changesets without MD5 sums considered changed?

The DATABASECHANGELOG in question was used by two different programs. This may be questionable, but used to work quite well.
Until… the programs started to use different versions of Liquibase with different formats of the MD5SUM. Upon detecting this, Liquibase writes NULL to the whole column (which is questionable in itself), thereby removing all the checksums for all the other programs. Depending on the order of Liquibase runs, different sets of checksums remained NULL in the end.
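A minimal sketch of how the two questions above might fit together, written as a simplified Python model rather than Liquibase's actual code. It assumes that a changeset counts as unapplied when DATABASECHANGELOG has no row for its id, author and file name, and that a NULL checksum is recomputed on the next run instead of being treated as a change. The function names, the dictionary layout and the plain MD5 digest are hypothetical stand-ins; the real checksum format differs between Liquibase versions.

```python
import hashlib


def checksum(changeset_body: str) -> str:
    # Stand-in for Liquibase's checksum (assumption: a plain MD5 hex digest).
    return hashlib.md5(changeset_body.encode("utf-8")).hexdigest()


def classify(changeset: dict, changelog_rows: dict) -> str:
    """Return 'unapplied', 'recompute', 'changed' or 'ok' for one changeset.

    changeset: dict with the keys 'id', 'author', 'filename' and 'body'.
    changelog_rows: maps (id, author, filename) to the stored MD5SUM or None.
    """
    key = (changeset["id"], changeset["author"], changeset["filename"])
    if key not in changelog_rows:
        return "unapplied"   # no row at all: the changeset was never run
    stored = changelog_rows[key]
    if stored is None:
        return "recompute"   # NULL checksum: nothing to compare against
    if stored != checksum(changeset["body"]):
        return "changed"     # row exists but the content differs
    return "ok"
```

In this model a NULL checksum alone never makes a changeset unapplied; only a missing row does.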

Related

Using H2 1.4 database can I write new rows if reading other rows

With an H2 1.4 database, can I write new rows while other rows are being read?
For example, if a table has 1000 rows and a running SELECT query is fetching primary keys 1-10, can an INSERT query add new rows at the same time, or does it have to wait for every SELECT query on that table to finish?
What about an UPDATE of rows in the same table that are not being retrieved by any SELECT query?
I ask because with H2 1.3 my application threads that accessed the database seemed to spend a lot of time blocking; it seems better now that I have upgraded to 1.4. My application is multithreaded and its threads always deal with different rows, so I need to better understand how locking works in H2 (with the MVStore; I previously used the PageStore with 1.3), and whether H2 can lock individual rows when updating or has to lock the whole table.
It depends on the storage engine that you choose. All information below applies to the most recent version (1.4.199); older versions differ in some respects.
With the default MVStore engine, data modification operations and SELECT … FOR UPDATE lock the modified (or selected) rows. Other transactions can't modify locked rows in parallel, but they can read their values. Note that the read committed isolation level is used by default and other isolation levels are not really supported by this engine. With read committed, other transactions do not see concurrently modified values; they see the old ones. The new values become visible only when the modifying transaction commits. With this engine the database runs in multi-threaded mode by default, so a long-running command does not block other sessions.
With the legacy PageStore engine (add ;MV_STORE=FALSE to the connection URL if you want to create a database with this engine), whole tables are locked for writing. This means that all your transactions need to lock tables in the same order (alphabetical or some other); otherwise a deadlock is possible. With this engine the database runs in single-threaded mode by default; you can enable multi-threaded mode explicitly, but it is not safe with this engine. Different sessions can't do their work concurrently, so a long-running command blocks all other sessions.
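The lock-ordering advice for table-level locks can be illustrated with a small Python sketch. It does not talk to H2; it uses one threading.Lock per table as a stand-in for a table write lock, and the table names and the run_transaction helper are hypothetical. Both threads ask for the same two tables in opposite orders, but because the helper always acquires locks alphabetically they cannot deadlock.

```python
import threading

# One lock per table, standing in for table-level write locks (assumption).
table_locks = {name: threading.Lock() for name in ("ACCOUNTS", "ORDERS")}


def run_transaction(tables, work):
    """Lock the given tables in alphabetical order, run work(), then unlock."""
    ordered = sorted(tables)
    for name in ordered:
        table_locks[name].acquire()
    try:
        return work()
    finally:
        # Release in reverse order of acquisition.
        for name in reversed(ordered):
            table_locks[name].release()


results = []
t1 = threading.Thread(target=run_transaction,
                      args=(["ORDERS", "ACCOUNTS"], lambda: results.append("t1")))
t2 = threading.Thread(target=run_transaction,
                      args=(["ACCOUNTS", "ORDERS"], lambda: results.append("t2")))
t1.start()
t2.start()
t1.join(timeout=5)
t2.join(timeout=5)
```

Without the sorted() call, each thread could grab its first table and then wait forever for the other one.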
Databases are not converted from the old (PageStore) format to the new (MVStore) format when you open them with a new version of H2; you have to do that yourself. Old databases may also have serious problems with new versions, so it's recommended to export them to SQL with the old version of H2 using the SCRIPT TO 'filename.sql' command and load that script into a new database with the new version of H2 using the RUNSCRIPT FROM 'filename.sql' command. You need to do this even if you choose to keep the old engine. If you have persistent databases, don't forget to create regular backup copies (with the BACKUP TO 'filename.zip' command, for example).
You can find more details in the documentation:
https://h2database.com/html/advanced.html#mvcc
https://h2database.com/html/features.html#multiple_connections

Why does git interpret sql files as binaries during a merge conflict?

I have a problem resolving merge conflicts in SQL files.
MenkeTTA#909086 MINGW64 //FILE0019 (master)
$ git pull
remote: Microsoft (R) Visual Studio (R) Team Services
remote: Found 5 objects to send. (5 ms)
Unpacking objects: 100% (5/5), done.
From https://***
d58a69b..4830c58 master -> origin/master
warning: Cannot merge binary files: example_StoredProcedure.sql (HEAD vs. 4830c5886d3e1eac5ac76d1d49496afb43f444c3)
Auto-merging WRR - example_StoredProcedure.sql
CONFLICT (content): Merge conflict in example_StoredProcedure.sql
Automatic merge failed; fix conflicts and then commit the result.
When the merge conflict occurs, Git doesn't create a pre-merged file with the competing changes in the usual structure:
/SQL-File/
<<<<<<< HEAD
competing change A
=======
competing change B
>>>>>>> branch-a
Git is treating both files as binaries, but only for the merge-conflict operations (a normal merge without conflict works properly). I can choose my own version of the file or the pulled competing file from the remote as the new head for the next push.
I reproduced this conflict with a normal .txt file. There Git handles the merge conflict as expected, creating one pre-merged file with both competing changes, which I can then fix manually however I want.
To make git recognize the sql files as text I added
.sql diff
to the .gitattributes file, as described in a linked answer. Does anyone know how I can make Git create an ordinary pre-merged file with both versions of the competing commits when working with SQL files?
First, a quick note:
To make git recognize the sql files as text I added
.sql diff
to the .gitattributes file ...
The .gitattributes line should read *.sql diff (I've fixed the linked answer, which is on a question about getting git diff to treat the file as text). However, if the file really is text, you may want, or even need, *.sql text. Note: this will not help at all if the file is not text. If the file's content is UTF-16, it is not text to Git, at least.
Consider marking the file as example_StoredProcedure.sql text, i.e., not all .sql files, just this one particular file. I'm also curious to see whether just marking it diff suffices! Update, Nov 2019: apparently marking the file as diff is not sufficient, though I have not verified this myself.
(The difference is that the diff attribute tells Git how to show the file in git diff output,[1] while the text attribute tells Git that, instead of using its built-in guessing algorithm, it should for all purposes use the setting to decide whether the file is text. The guessing algorithm consists of scanning an initial chunk of the file's contents to see how many "text-like" characters there are vs "non-text-like" characters. Probably there should be a special allowance for UTF-8 byte order marks at the top, but there isn't. Curiously, during filtering, there are explicit checks.)
[1] Well, it's actually more involved than just showing, but I think this is a good way to start thinking about the issues. Note that you can augment the diff setting with a driver. It's not clear to me how the low-level file merge interacts with a diff driver and I do not have time to experiment with it right now.
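To see why UTF-16 content tends to be classified as binary, here is a minimal sketch. It assumes, as a simplification, the check that Git's diff and merge code applies to a buffer: the content is treated as binary when a NUL byte appears within the first 8000 bytes. The name looks_binary is hypothetical, and the separate text-detection heuristic used for line-ending conversion is more involved than this.

```python
FIRST_FEW_BYTES = 8000  # size of the initial chunk that gets inspected (assumption)


def looks_binary(data: bytes) -> bool:
    """Treat the buffer as binary if a NUL byte occurs in the initial chunk."""
    return b"\0" in data[:FIRST_FEW_BYTES]


sql = "SELECT 1;\n"
print(looks_binary(sql.encode("utf-8")))      # False
print(looks_binary(sql.encode("utf-16-le")))  # True: every ASCII character is followed by a NUL byte
```

The same SQL text passes or fails the check purely because of its encoding, which is consistent with the remark that UTF-16 is not text to Git.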
Longer explanation
warning: Cannot merge binary files: example_StoredProcedure.sql (... vs ...)
tells us that you are correct, that Git is treating the three versions of example_StoredProcedure.sql as binary. (I see you added this output after the initial question; good thing, since it's the key!)
But why did I say three versions, when the line goes on to say:
HEAD vs. 4830c5886d3e1eac5ac76d1d49496afb43f444c3
Git is being a little lazy here: all merges involve three inputs, not just two. One of these is the one you supply explicitly—or, as in this case, git pull ran git merge and git pull itself supplied the big ugly hash ID 4830c5886d3e1eac5ac76d1d49496afb43f444c3.
The second input to a merge is always the current commit, aka HEAD. You normally get this by being on the branch in the first place: HEAD names the branch-name, the branch-name identifies the commit, and this is where you want the final merge commit to go, so it all fits together.
The third input (or, internally, the first; internally the "theirs" version is the third input) is one that Git computes for you, based on the HEAD and other (--theirs) commits: Git walks through enough of the commit graph to find the best common ancestor commit.[1] It's this common ancestor commit that determines which files need merging, and if a file does need merging, the built-in merge driver needs to use diffs to get textual changes to merge. For both this and for git diff, Git has a differencing engine built into it (modified from LibXDiff).
Hence Git can, in effect, run:
git diff --find-renames <merge base commit> HEAD
to see what we did to each of our files, and:
git diff --find-renames <merge base commit> <other commit>
to see what they did to each of our files. Then:
- If we changed a file and they did not touch it at all, the merge is easy: take ours.
- If they changed a file and we did not touch it at all, the merge is easy: take theirs.
- If we both changed a file but made the new file exactly the same, the merge is easy: take either one (ours, really, since it's in place).
- Otherwise, attempt to combine the changes.
For speed reasons, Git uses the hash IDs ("blob" hashes, for the file's content) to accomplish the first three bullet points without ever having to fire up the file-level diff. This can, and does, merge unconflicted binary-file changes. It's only the final stage, where all three blob hashes differ, that requires a textual diff so as to combine changes.
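The four cases above can be sketched in Python as a decision on three blob hashes. This is an illustrative model, not Git's code: it assumes the file exists in all three versions (no adds, deletes or renames), uses SHA-1 object IDs, and the names blob_id and resolve are hypothetical.

```python
import hashlib


def blob_id(content: bytes) -> str:
    # A Git blob ID is the hash of "blob <size>\0" followed by the content.
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def resolve(base: bytes, ours: bytes, theirs: bytes) -> str:
    """Decide how to merge one file from the hashes of its three versions."""
    b, o, t = blob_id(base), blob_id(ours), blob_id(theirs)
    if t == b:
        return "take ours"      # they did not touch the file
    if o == b:
        return "take theirs"    # we did not touch the file
    if o == t:
        return "take ours"      # both sides made the identical change
    return "content merge"      # all three differ: a textual diff is needed
```

Only the last branch ever looks inside the file, which is why unconflicted changes to binary files merge without trouble while conflicting ones trigger the "Cannot merge binary files" warning.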
Obviously, if Git can't diff the file, it cannot merge the two diff outputs. But does just marking the file as text-diff-able (pattern diff in .gitattributes) make the merge proceed? What happens if you set a diff driver, does the low-level file merge code use that driver? It "wants" to use the xdiff internal interface to find hunks; that's a lot easier than interpreting text output from a driver; you probably have to define a merge driver to get a detected-as-binary file to be merged, even if you have marked it as diff.
Additional note, Nov 2019: Since Git 2.18, Git has the ability to convert between committed UTF-8 data and work-tree data in another encoding. To use this, set the working-tree-encoding attribute. For instance, the gitattributes documentation shows an example line:
*.ps1 text working-tree-encoding=UTF-16LE eol=CRLF
that would keep all *.ps1 files in UTF-8 internally (in the frozen, committed files inside each commit) but keep the useful-format versions of those files in your work-tree in UTF-16-LE. I have no data as to whether this would work with these SQL files.
[1] In all cases, but especially in problem cases where there's more than one best common ancestor, git merge's behavior actually depends on the strategy you chose. The usual recursive strategy will merge the merge bases, commit the result, and then use that commit as the merge base! Other merge strategies work differently.

Generate changeLog for single sql files with liquibase

I have a huge database with many SQL files. Is there any way to generate a changelog for single SQL files and not for the whole database? I have stored some SQL files locally on my hard drive and use Liquibase via the command line. If there is no way to do that with local SQL files, is there a way to generate a changelog for single tables of my database?
What you are looking for is not possible. A database does not remember the SQL that was executed to get the database into a certain state. Here is a real simple example. Say that you first run some SQL to create a table with two columns 'name' and 'id'. Then you run some more sql to add a third column 'active'. The database does not remember that two separate operations were run to get into that state. When Liquibase generates a changelog for that database, it basically has to ask the database 'what is the current state of things?' and so it would have a changeset that creates the table with all three columns.
It is possible to have liquibase generate smaller changelog files, but you should probably take a step back and ask yourself why you want to do that.
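The two-step example with the 'name', 'id' and 'active' columns can be reproduced in a few lines of Python. This sketch uses an in-memory SQLite database as a stand-in for the real database, and the table name person and the helper columns are hypothetical. It compares the column list after "create, then alter" with the column list after a single "create".

```python
import sqlite3


def columns(statements):
    """Run the statements on a fresh in-memory database and return the
    (name, type) pairs of the columns of table 'person'."""
    conn = sqlite3.connect(":memory:")
    for sql in statements:
        conn.execute(sql)
    info = conn.execute("PRAGMA table_info(person)").fetchall()
    conn.close()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [(name, col_type) for _cid, name, col_type, *_rest in info]


two_steps = columns([
    "CREATE TABLE person (id INTEGER, name TEXT)",
    "ALTER TABLE person ADD COLUMN active INTEGER",
])
one_step = columns([
    "CREATE TABLE person (id INTEGER, name TEXT, active INTEGER)",
])
print(two_steps == one_step)  # True: the resulting column list carries no trace of the history
```

A tool that only inspects the current schema sees the same three columns either way, so it can only emit one changeset that creates the complete table.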

liquibase does not detect data changes

When I use liquibase diff with diffTypes=data on two tables in MySQL, data changes are not detected. In one of the tables I changed an existing entry, and I inserted one row into a table; neither change is detected by Liquibase. Structural changes are detected without problems.
Here is my liquibase diff call:
liquibase --diffTypes=data --driver=com.mysql.jdbc.Driver --url=jdbc:mysql://localhost:3306/magento --username=username --password=password diff --referenceUrl=jdbc:mysql://localhost:3306/marketing_magento --referenceUsername=username --referencePassword=password
The changes are in the magento db.
Here is the result:
Product Name: EQUAL
Product Version: EQUAL
Missing Data(s): NONE
Unexpected Data(s): NONE
Changed Data(s): NONE
Liquibase 'diff' Successful
Regards, Karsten
Liquibase doesn't support this kind of data differencing. It can output the data in certain limited cases - the main one being where a table doesn't exist at all in one database.
Because the primary use case is for doing structural change management, design decisions were made to optimize the performance of that use case. Doing row-by-row data comparisons is very expensive, performance-wise, and tedious to do correctly.
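To give a feel for what a row-by-row comparison involves, here is a minimal Python sketch. It is not something Liquibase does; it assumes both tables fit in memory as dictionaries keyed by primary key, and the name diff_rows is hypothetical. Every row of both tables has to be fetched and compared, which is where the cost comes from on large tables.

```python
def diff_rows(reference: dict, target: dict):
    """Compare two tables given as {primary_key: row_tuple} dictionaries.

    Returns three sorted lists of primary keys: rows missing from target,
    rows only present in target, and rows whose values differ.
    """
    missing = sorted(k for k in reference if k not in target)
    unexpected = sorted(k for k in target if k not in reference)
    changed = sorted(k for k in reference
                     if k in target and reference[k] != target[k])
    return missing, unexpected, changed
```

A real implementation would also have to deal with tables without primary keys, type differences between databases and streaming of large result sets, which is part of why this is tedious to do correctly.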

executing a common sql file using liquibase

I have a situation to handle: my Liquibase changelogs are structured according to the recommended best practices. The changelog XML files are organized as follows:
Master XML
-->Release XML
-->Feature XML
-->changelog XML
In our application group, we run updateSQL to generate the consolidated sql file and get the changes executed through our DBA group.
However, the real problem I have is executing a common set of SQL statements during every iteration, such as
ALTER SESSION SET CURRENT_SCHEMA=APPLNSCHEMA
as the DBA executes the changes as SYSTEM but the target schema is APPLNSCHEMA.
How can I include such common, repeated statements in a Liquibase changelog?
You would be able to write an extension (http://liquibase.org/extensions) that injects it in. If you need to do it per changeLog, it may work best to extend XMLChangeLogParser to automatically create and add a new changeSet that runs the needed SQL.
You could make a changeSet with the attribute 'runAlways' set to true and include the SQL.
As far as I know, there isn't a way to have Liquibase itself do this. I suggest that you wrap Liquibase with your favorite scripting language such that you run a command "generateSQLforThoseCrazyDBAs" that runs Liquibase and then prepends the SQL you need to the output created by Liquibase.
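The wrapper idea could look roughly like the following Python sketch. It assumes the SQL generated by updateSQL has already been captured as a string (for example by redirecting the command's output to a file and reading it back); the function name prepend_session_sql and the trailing semicolon on the statement are assumptions.

```python
# Statement the DBA needs at the top of every generated script.
SESSION_SQL = "ALTER SESSION SET CURRENT_SCHEMA=APPLNSCHEMA;"


def prepend_session_sql(liquibase_sql: str, prefix: str = SESSION_SQL) -> str:
    """Return the generated SQL with the session statement placed first."""
    return prefix + "\n" + liquibase_sql


generated = "CREATE TABLE example (id INT);\n"  # placeholder for updateSQL output
print(prepend_session_sql(generated))
```

Keeping the statement outside the changelog means it is not recorded in DATABASECHANGELOG.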