Liquibase - different versions of file

I am orchestrating an automatic delivery process using Git, Bamboo, Ansible, and Liquibase.
I am having some issues trying to use the Liquibase rollback feature. Basically I have the same files, my changeset master and the version files (each having its own rollback section), in two different places, say an "update" folder and a "rollback" folder. Even though the files are the same, the rollback simply does not work. Illustrating:
+ deployment_folder
  + update
    - changeset-master.xml
    - changeset-1.0.0.xml
    - changeset-1.0.1.xml
  + rollback
    - changeset-master.xml
    - changeset-1.0.0.xml
    - changeset-1.0.1.xml
The files have exactly the same content.
Running liquibase updates and tagging is fine:
$>liquibase --username=USR --password=*** --classpath=./ojdbc7.jar --driver=oracle.jdbc.driver.OracleDriver --url=jdbc:oracle:thin:@host:port:SID --changeLogFile=update/changeset-master.xml update
$>liquibase --username=USR --password=*** --classpath=./ojdbc7.jar --driver=oracle.jdbc.driver.OracleDriver --url=jdbc:oracle:thin:@host:port:SID --changeLogFile=update/changeset-master.xml tag 1.0.0
$>liquibase --username=USR --password=*** --classpath=./ojdbc7.jar --driver=oracle.jdbc.driver.OracleDriver --url=jdbc:oracle:thin:@host:port:SID --changeLogFile=update/changeset-master.xml update
$>liquibase --username=USR --password=*** --classpath=./ojdbc7.jar --driver=oracle.jdbc.driver.OracleDriver --url=jdbc:oracle:thin:@host:port:SID --changeLogFile=update/changeset-master.xml tag 1.0.1
However, when trying to roll back from 1.0.1 to 1.0.0 using the changeset master in the rollback folder, it says "Liquibase Rollback Successful" but the changes are not rolled back. The rollbackSQL command also does not display any relevant SQL statement other than the DATABASECHANGELOGLOCK updates.
$>liquibase --username=USR --password=*** --classpath=./ojdbc7.jar --driver=oracle.jdbc.driver.OracleDriver --url=jdbc:oracle:thin:@host:port:SID --changeLogFile=rollback/changeset-master.xml rollback 1.0.0
It looks like the file has to be exactly the same (for the checksum, I suppose), which is a showstopper in my case, where I have to constantly pull versions from my source control system, so the files will never be "the same", although they have the same content. Is there any way to disable this verification in Liquibase? Currently I am using Liquibase 3.4.2.

Liquibase rollback doesn't work this way: you need to specify rollback instructions in the <rollback> section of your changesets and execute them with the rollback command: http://www.liquibase.org/documentation/rollback.html
UPD: Liquibase identifies changes by an id based on the changeset file name, the changeset id, and the author id. If you apply <changeSet id="1" author="dbf">...</changeSet> from file f1.xml, its 'virtual' id is <f1.xml, 1, dbf>. When you run f2.xml with the same content for rollback, the id it calculates is <f2.xml, 1, dbf>, so it doesn't match the first id and nothing is rolled back.
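For illustration, a minimal changeset with an explicit <rollback> block might look like the sketch below (the table and column names are made up for the example):
<databaseChangeLog xmlns="http://www.liquibase.org/xml/ns/dbchangelog">
    <changeSet id="1" author="dbf">
        <createTable tableName="example_table">
            <column name="id" type="int"/>
        </createTable>
        <rollback>
            <dropTable tableName="example_table"/>
        </rollback>
    </changeSet>
</databaseChangeLog>
Because the changelog file path is part of that identity, the rollback above is only found when the rollback command is run against the same changelog path that was used for the update.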

How to specify the searchPath when using liquibase commandline liquibase.integration.commandline.Main

(Using liquibase 4.18.0 and also tried 4.19.0)
I want to add two additional parameters to my (working) liquibase call
--hub-mode=off
--searchPath="some/resources"
Working:
java liquibase.integration.commandline.Main --logLevel=info --defaultsFile=project.properties update
Not working:
java liquibase.integration.commandline.Main --logLevel=info --searchPath="some/resources" --defaultsFile=project.properties update
I always get:
Unknown option 'searchPath'
If I remove this option I get the same error for hub-mode. If I remove both, the resource cannot be found and Liquibase tells me:
"More locations can be added with the 'searchPath' parameter."
I checked the declaredFields variable; the following options are defined there, and the two I need are missing:
runningFromNewCli
newCliChangelogParameters
outputStream
LOG
coreBundle
classLoader
driver
username
password
url
hubConnectionId
hubProjectId
hubProjectName
databaseClass
defaultSchemaName
outputDefaultSchema
outputDefaultCatalog
liquibaseCatalogName
liquibaseSchemaName
databaseChangeLogTableName
databaseChangeLogLockTableName
databaseChangeLogTablespaceName
defaultCatalogName
changeLogFile
overwriteOutputFile
classpath
contexts
labels
labelFilter
driverPropertiesFile
propertyProviderClass
changeExecListenerClass
changeExecListenerPropertiesFile
promptForNonLocalDatabase
includeSystemClasspath
defaultsFile
diffTypes
changeSetAuthor
changeSetContext
dataOutputDirectory
referenceDriver
referenceUrl
referenceUsername
referencePassword
referenceDefaultCatalogName
referenceDefaultSchemaName
currentDateTimeFunction
command
commandParams
logLevel
logFile
changeLogParameters
outputFile
excludeObjects
includeCatalog
includeObjects
includeSchema
includeTablespace
deactivate
outputSchemasAs
referenceSchemas
schemas
snapshotFormat
liquibaseProLicenseKey
liquibaseProLicenseValid
liquibaseHubApiKey
liquibaseHubUrl
managingLogConfig
outputsLogMessages
sqlFile
delimiter
rollbackScript
rollbackOnError
suspiciousCodePoints
Any idea how to specify the searchPath for the command-line executable?
I did read this post but the solution did not help.

Deploy sql workflow with DBX

I am developing deployment via DBX to Azure Databricks. In this regard I need a data job written in SQL to run every day. The job is located in the file data.sql. I know how to do it with a Python file. Here I would do the following:
build:
  python: "pip"
environments:
  default:
    workflows:
      - name: "workflow-name"
        schedule:
          quartz_cron_expression: "0 0 9 * * ?" # every day at 9.00
          timezone_id: "Europe"
        format: MULTI_TASK
        job_clusters:
          - job_cluster_key: "basic-job-cluster"
            <<: *base-job-cluster
        tasks:
          - task_key: "task-name"
            job_cluster_key: "basic-job-cluster"
            spark_python_task:
              python_file: "file://filename.py"
But how can I change it so I can run a SQL job instead? I imagine it is the last two lines of code (spark_python_task: and python_file: "file://filename.py") which need to be changed.
There are various ways to do that.
(1) One of the simplest is to add a SQL query in the Databricks SQL lens, and then reference this query via sql_task as described here.
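As a rough sketch of option (1), assuming a saved query and a SQL warehouse already exist in your workspace (the query_id and warehouse_id values below are placeholders), the task entry could look roughly like this:
tasks:
  - task_key: "sql-task-name"
    sql_task:
      query:
        query_id: "<your-query-id>"
      warehouse_id: "<your-sql-warehouse-id>"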
(2) If you want to have a Python project that re-uses SQL statements from a static file, you can add this file to your Python Package and then call it from your package, e.g.:
sql_statement = ... # code to read from the file
spark.sql(sql_statement)
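A possible way to flesh out that snippet, assuming the SQL file is shipped inside a package called my_package (the package and file names are placeholders) and that spark is the Spark session Databricks provides to the job:
from importlib.resources import files

# read the data.sql file that was bundled with the package
sql_statement = files("my_package").joinpath("data.sql").read_text()

# execute it on the cluster's Spark session (provided by Databricks)
spark.sql(sql_statement)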
(3) A third option is to use the DBT framework with Databricks. In this case you probably would like to use dbt_task as described here.
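As with the sql_task sketch above, the exact fields are an assumption based on the Databricks Jobs API and the names are placeholders, but a dbt task could be declared roughly like this:
tasks:
  - task_key: "dbt-task-name"
    dbt_task:
      commands:
        - "dbt deps"
        - "dbt run"
      warehouse_id: "<your-sql-warehouse-id>"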
I found a simple workaround (although it might not be the prettiest): simply change data.sql to a Python file and run the queries using Spark. This way I could use the same spark_python_task.

How to fetch a launch plan using the Flyte APIs without specifying a SHA?

I would like to use the Flyte APIs to fetch the latest launch plan for a deployment environment without specifying the SHA.
Users are encouraged to specify the SHA when referencing Launch Plans or any other Flyte entity. However, there is one exception: Flyte has the notion of an active launch plan. For a given project/domain/name combination, a Launch Plan can have any number of versions. All four fields combined (project, domain, name, version) identify one specific Launch Plan; those four fields are the primary key. At most one of those launch plans can also be what we call 'active'.
To see which ones are active, you can use the list-active-launch-plans command in flyte-cli
(flyte) captain@captain-mbp151:~ [k8s: flytemain] $ flyte-cli -p skunkworks -d production list-active-launch-plans -l 200 | grep TestFluidDynamics
NONE 248935c0f189c9286f0fe13d120645ddf003f339 lp:skunkworks:production:TestFluidDynamics:248935c0f189c9286f0fe13d120645ddf003f339
However, please be aware that if a launch plan is active, and has a schedule, that schedule will run. There is no way to make a launch plan "active" but disable its schedule (if it has one).
If you would like to set a launch plan as active, you can do so with the update-launch-plan command.
First find the version you want (results truncated):
(flyte) captain@captain-mbp151:~ [k8s: flytemain] $ flyte-cli -p skunkworks -d staging list-launch-plan-versions -n TestFluidDynamics
Using default config file at /Users/captain/.flyte/config
Welcome to Flyte CLI! Version: 0.7.0b2
Launch Plan Versions Found for skunkworks:staging:TestFluidDynamics
Version Urn Schedule Schedule State
d4cf71c20ce987a4899545ae01286f42297a8f3b lp:skunkworks:staging:TestFluidDynamics:d4cf71c20ce987a4899545ae01286f42297a8f3b
9d3e8d156f7ba0c9ac338b5d09949e88eed1f6c2 lp:skunkworks:staging:TestFluidDynamics:9d3e8d156f7ba0c9ac338b5d09949e88eed1f6c2
248935c0f189c928b6ffe13d120645ddf003f339 lp:skunkworks:staging:TestFluidDynamics:248935c0f189c928b6ffe13d120645ddf003f339
...
Then
flyte-cli update-launch-plan --state active -u lp:skunkworks:staging:TestFluidDynamics:d4cf71c20ce987a4899545ae01286f42297a8f3b

How to get information on latest successful pod deployment in OpenShift 3.6

I am currently working on making a CI/CD script to deploy a complex environment into another environment. We have multiple technologies involved, and I currently want to optimize this script because it is taking too much time to fetch information on each environment.
In the OpenShift 3.6 section, I need to get the last successful deployment for each application in a specific project. I tried to find a quick way to do so, but right now I have only found this solution:
oc rollout history dc -n <Project_name>
This will give me the following output
deploymentconfigs "<Application_name>"
REVISION STATUS CAUSE
1 Complete config change
2 Complete config change
3 Failed manual change
4 Running config change
deploymentconfigs "<Application_name2>"
REVISION STATUS CAUSE
18 Complete config change
19 Complete config change
20 Complete manual change
21 Failed config change
....
I then take this output and parse each line to find the latest revision that has the status "Complete".
In the above example, I would get this list :
<Application_name> : 2
<Application_name2> : 20
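For reference, a minimal awk sketch of that parsing step (the project name is a placeholder; the script simply keeps the highest "Complete" revision it sees for each deployment config):
oc rollout history dc -n <Project_name> | awk '
  /^deploymentconfigs/ { app = $2 }
  $2 == "Complete"     { latest[app] = $1 }
  END { for (a in latest) print a, ":", latest[a] }'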
Then for each application and each revision I do:
oc rollout history dc/<Application_name> -n <Project_name> --revision=<Latest_Revision>
In the above example the Latest_Revision for Application_name is 2, which is the latest complete revision that is neither building nor failed.
This will give me the output with the information I need, which is the version of the ear and the version of the configuration that was used in the creation of the image used for this successful deployment.
But since I have multiple applications, this process can take up to 2 minutes per environment.
Would anybody have a better way of fetching the information I require?
Unless I am mistaken, it looks like there is no one-liner that can get this information for the currently running and accessible application.
Thanks
Assuming that the currently active deployment is the latest successful one, you may try the following:
oc get dc -a --no-headers | awk '{print "oc rollout history dc "$1" --revision="$2}' | . /dev/stdin
It gets a list of deployments and feeds it to awk to extract the name ($1) and revision ($2), then compiles your command to extract the details, and finally sends it to standard input to execute. It may be frowned upon for not using xargs or the like, but I found it easier for debugging (just drop the last part and see the commands printed out).
UPDATE:
On second thoughts, you might actually like this one better:
oc get dc -a -o jsonpath='{range .items[*]}{.metadata.name}{"\n\t"}{.spec.template.spec.containers[0].env}{"\n\t"}{.spec.template.spec.containers[0].image}{"\n-------\n"}{end}'
The example output:
daily-checks
[map[name:SQL_QUERIES_DIR value:daily-checks/]]
docker-registry.default.svc:5000/ptrk-testing/daily-checks@sha256:b299434622b5f9e9958ae753b7211f1928318e57848e992bbf33a6e9ee0f6d94
-------
jboss-webserver31-tomcat
registry.access.redhat.com/jboss-webserver-3/webserver31-tomcat7-openshift@sha256:b5fac47d43939b82ce1e7ef864a7c2ee79db7920df5764b631f2783c4b73f044
-------
jtask
172.30.31.183:5000/ptrk-testing/app-txeq:build
-------
lifebicycle
docker-registry.default.svc:5000/ptrk-testing/lifebicycle@sha256:a93cfaf9efd9b806b0d4d3f0c087b369a9963ea05404c2c7445cc01f07344a35
You get the idea, with expressions like .spec.template.spec.containers[0].env you can reach for specific variables, labels, etc. Unfortunately the jsonpath output is not available with oc rollout history.
UPDATE 2:
You could also use post-deployment hooks to collect the data, if you can set up a listener for the hooks. Hopefully the information you need is inherited by the PODs. More info here: https://docs.openshift.com/container-platform/3.10/dev_guide/deployments/deployment_strategies.html#lifecycle-hooks

Generating TPC-DS database for sql server

How do I populate the Transaction Processing Performance Council's TPC-DS database for SQL Server? I have downloaded the TPC-DS tool but there are few tutorials about how to use it.
In case you are using Windows, you need Visual Studio 2005 or later. Unzip the kit; in the tools folder there is a dsgen2.sln file. Open it with Visual Studio and build the project; it will generate the tables for you. I've tried that and loaded the tables manually into SQL Server.
I've just succeeded in generating these queries.
Here are some tips; they may not be the best approach, but they are useful:
cp ${...}/query_templates/* ${...}/tools/
add define _END = ""; to each query.tpl
${...}/tools/dsqgen -INPUT templates.lst -OUTPUT_DIR /home/query99/
Let's describe the basic steps:
Before going to the next steps, double-check that the required TPC-DS kit has not already been prepared for your DB.
Download TPC-DS Tools
Build Tools as described in 'v2.11.0rc2\tools\How_To_Guide-DS-V2.0.0.docx' (I used VS2015)
Create DB
Take the DB schema described in tpcds.sql and tpcds_ri.sql (they are located in the 'v2.11.0rc2\tools\' folder) and adapt it to your DB if required.
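Since the question targets SQL Server, applying the schema could, for instance, be done with sqlcmd (the server name and credentials are placeholders):
sqlcmd -S <server> -U <user> -P <password> -Q "CREATE DATABASE tpcds"
sqlcmd -S <server> -U <user> -P <password> -d tpcds -i tpcds.sql
sqlcmd -S <server> -U <user> -P <password> -d tpcds -i tpcds_ri.sql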
Generate the data to be stored in the database
# Windows
dsdgen.exe /scale 1 /dir .\tmp /suffix _001.dat
# Linux
dsdgen -scale 1 -dir /tmp -suffix _001.dat
Upload data to DB
# example for ClickHouse
database_name=tpcds
ch_password=12345
for file_fullpath in /tmp/tpc-ds/*.dat; do
    filename=$(echo ${file_fullpath##*/})
    tablename=$(echo ${filename%_*})
    echo " - $(date +"%T"): start processing $file_fullpath (table: $tablename)"
    query="INSERT INTO $database_name.$tablename FORMAT CSV"
    cat $file_fullpath | clickhouse-client --format_csv_delimiter="|" --query="$query" --password $ch_password
done
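For SQL Server itself, a comparable bulk load could be sketched with bcp (the database name, server, and credentials are placeholders; the file-to-table mapping follows the same naming convention as the ClickHouse example above):
# example for SQL Server
database_name=tpcds
for file_fullpath in /tmp/tpc-ds/*.dat; do
    filename=$(basename "$file_fullpath")
    tablename=${filename%_*}
    # -c: character mode, -t "|": the field terminator used by dsdgen
    bcp "$database_name.dbo.$tablename" in "$file_fullpath" -c -t "|" -S <server> -U <user> -P <password>
done
Depending on how dsdgen terminates its rows, you may need to strip a trailing delimiter or switch to a bcp format file.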
Generate queries
# Windows
set tmpl_lst_path="..\query_templates\templates.lst"
set tmpl_dir="..\query_templates"
set dialect_path="..\..\clickhouse-dialect"
set result_dir="..\queries"
set tmpl_name="query1.tpl"
dsqgen /input %tmpl_lst_path% /directory %tmpl_dir% /dialect %dialect_path% /output_dir %result_dir% /scale 1 /verbose y /template %tmpl_name%
# Linux
# see for example https://github.com/pingcap/tidb-bench/blob/master/tpcds/genquery.sh
To fix the error 'Substitution .. is used before being initialized' follow this fix.