Periodically execute Redshift SQL scripts stored on GitHub using Jenkins


I have a few Redshift SQL scripts on GitHub which I need to execute in sequence every 30 minutes.
e.g.
At 10:00 am
script001.sql to be executed first
script002.sql to be executed next and so on...
At 10:30 am
script001.sql to be executed first
script002.sql to be executed next and so on...
These scripts run on already existing tables in Redshift. Some of them create tables which are used by subsequent scripts, so the order of execution matters to avoid a "Table not found" error.
I have tried:
Creating a freestyle project in Jenkins with the following configuration:
General Tab --> GitHub Project and provided a Project URL
Source Code Management Tab --> Selected Git and provided Repository URL in the format https://personaltoken@github.com/site/repo.git
Branches to build Tab --> */main
Build Triggers Tab --> Build periodically (H/30 * * * *)
Now I don't know how to add a build step that executes the queries from GitHub. The build succeeds but obviously does nothing, as no steps have been defined.
Creating a pipeline project in Jenkins with the same configuration as above but without a Pipeline script, as I am not sure how to write one that runs Redshift SQL stored on GitHub.
I looked for a solution but couldn't find anything for Redshift. There are tutorials and snippets for SQL Server and Oracle, but their scripts are different and can't be used with Redshift.
Any help on this would be greatly appreciated.
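
One possible shape for the missing build step, as a minimal sketch rather than a definitive answer: a small Python script that Jenkins runs from the checked-out workspace and that executes every .sql file in file-name order, stopping at the first failure so later scripts never run against missing tables. The function names, the environment variable names and the choice of the psycopg2 driver are all assumptions, not something from the question.

```python
import os
from pathlib import Path
from typing import Callable, List


def run_sql_scripts(script_dir: str, execute: Callable[[str], object]) -> List[str]:
    """Run every *.sql file in script_dir in file-name order.

    `execute` is any callable that runs one SQL string (for example a
    database cursor's execute method). An exception from it propagates,
    so the remaining scripts are skipped and the Jenkins build fails.
    """
    executed = []
    for path in sorted(Path(script_dir).glob("*.sql")):
        execute(path.read_text(encoding="utf-8"))
        executed.append(path.name)
    return executed


def main() -> None:
    # Assumption: psycopg2 is installed on the Jenkins agent and the
    # connection details are injected as (hypothetically named)
    # environment variables, e.g. from Jenkins credentials.
    import psycopg2

    conn = psycopg2.connect(
        host=os.environ["REDSHIFT_HOST"],
        port=int(os.environ.get("REDSHIFT_PORT", "5439")),
        dbname=os.environ["REDSHIFT_DB"],
        user=os.environ["REDSHIFT_USER"],
        password=os.environ["REDSHIFT_PASSWORD"],
    )
    conn.autocommit = True
    try:
        with conn.cursor() as cur:
            for name in run_sql_scripts(".", cur.execute):
                print(f"executed {name}")
    finally:
        conn.close()
```

Saved in the repository under a hypothetical name such as run_scripts.py with a final main() call, this could be started from an "Execute shell" build step (python run_scripts.py). The H/30 * * * * trigger from the question then gives the 30-minute schedule, and names like script001.sql, script002.sql sort into the required order.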

Related

Determine which SQL script file is being used

I'm starting to learn SQL in class and am using the SQL command line to run some pre-built SQL files,
i.e.
start "directory/file_1.sql"
A problem arose during class last week when we were switching between multiple scripts and I lost track of which file I was working on.
At the moment, I'm using another pre-built script that drops all tables and then starts the second script.
Is there a way to check which file is currently being used?
//edit
Using the SQL*Plus tool for Oracle 12c
//edit2
In ongoing confusion of which program I am using, I've attached the screenshot
//edit3
The script files are located on Windows 10,
so once I log in as scott I do
start "C:\Users\Documents\Oracle SQL\JLDB_Build.sql"
but sometimes I have to work on
start "C:\Users\Documents\Oracle SQL\JLDB_Build_5.sql"
and then I forget whether I'm working on build or build_5.

flyway database script logging

I am currently evaluating Flyway software as a deployment option for our
company. We run our database deployments on an ORACLE database and
currently spool the output from a sqlplus session for logging purposes. We
use this to verify feedback such as whether objects were created
successfully, whether packages, functions, etc. compiled without errors,
how many records were inserted, and so forth.
Is there similar logging functionality in Flyway? Currently the only
logging we have found is in the server logs. We can tell from these logs
that a script has completed successfully or has triggered an ORA error but
we are curious as to whether this is the extent of the database logging
options or not.
Thank you,

We used the command-line method for running Flyway and turned on debug output (-X). Along with a lot of other output, it logs more information about the SQL migrations that were run (e.g. the content of repeatable migrations) and the number of records affected. This is not perfect, but it helped us a lot in capturing more information about what was applied.
See https://flywaydb.org/documentation/commandline/; the flag is not documented under each individual command because it applies to Flyway itself.
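
To keep that debug output as a deployment log, in the spirit of the sqlplus spool mentioned in the question, the command can be wrapped so that its combined output lands in a file. The following is only a sketch: the helper names are hypothetical, and any Flyway options beyond -X and the migrate command are left to the caller.

```python
import subprocess
from typing import List


def run_and_log(command: List[str], log_path: str) -> int:
    """Run a command, write its stdout and stderr to log_path, return the exit code."""
    with open(log_path, "w", encoding="utf-8") as log:
        result = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode


def flyway_debug_command(extra_args: List[str]) -> List[str]:
    """Build a `flyway -X ... migrate` command line (debug output enabled)."""
    return ["flyway", "-X", *extra_args, "migrate"]
```

For example, run_and_log(flyway_debug_command([]), "deploy.log") would leave the debug output in deploy.log, where it can be searched for ORA errors or affected-row counts after the run.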

Increment Version number when checking in to TFS

I have a folder in TFS which contains SQL scripts. At the moment I am manually adding a comment and updating a version number inside the comment every time I make a change and check it back in. This works; however, I was hoping there might be a better way. Is there a way to automate this in TFS?
I have read the following article
Version control project files
Do I have to go through such a process for simple .sql files? Are there any other simple ways?

There are a few ways you can do this:
Create an automated build in TFS and write a custom build step / PowerShell script to parse the appropriate SQL scripts, read the version, increment it, and store the new version either by checking in the updated file or in a local store.
Use a database project (part of SQL Server Data Tools), which will output a DACPAC. Inside the database project, you can set the version as specified here. This stores the version in the project file. If you update your TFS build number to be digits only, you can then update the project file to set that value to match the build using a custom build task. For example, your build number could be yyyy.m.d.R, where R is the number of times that build was run today (TFS manages that; it's the revision variable). Or you could set the <DacVersion> tag to something like 2.1.0.0 and have your build replace the last digit with yyyymmddr.
I'd recommend using a database project. It's pretty easy to create a new database project off an existing database.
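
The last variant (replacing the final component of <DacVersion> during the build) can be sketched as a small text substitution. The answer suggests a PowerShell build step; Python is used here only for illustration, and the function name and the single-tag assumption are hypothetical.

```python
import re

# Matches <DacVersion>a.b.c.d</DacVersion> and captures everything except d.
DAC_VERSION = re.compile(r"(<DacVersion>\s*\d+\.\d+\.\d+\.)\d+(\s*</DacVersion>)")


def set_dac_revision(project_xml: str, revision: int) -> str:
    """Replace the last component of the DacVersion tag with `revision`.

    Assumes the project file contains exactly one four-part DacVersion tag.
    """
    new_xml, count = DAC_VERSION.subn(
        lambda m: f"{m.group(1)}{revision}{m.group(2)}", project_xml
    )
    if count != 1:
        raise ValueError(f"expected exactly one DacVersion tag, found {count}")
    return new_xml
```

A build step would read the project file, call set_dac_revision with a number derived from the build (for instance the yyyymmddr value mentioned above), and write the result back before the DACPAC is built.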

The first way mentioned by Jacob above can achieve that. If you just want to increment the version number of the script/folder, just create a CI build definition.
Actually, you can just enable Label sources, set the Label format using predefined variables such as $(build.buildNumber), and skip publishing any artifacts during the build process.
Thus, the CI build will be triggered automatically when you check in files, and the source (SQL script/folder) will be labeled with the incremented number.
Then you can find specific versions by their label.

How to automate source control with Oracle database

I work in an Oracle instance that has hundreds of schemas and multiple developers. We have a development instance where developers can integrate their work before test or production.
We want to have source control for all the DDL run in this integrated development database. Currently this is done through a Redgate product which we run manually after we make a change to the database. Redgate finds the changes between what is in the schema and what was last checked into source control, makes a script of the differences, and puts this into source control.
The problem, of course, is that running Redgate can take some time, so people run it infrequently or not at all for small changes. Also, Redgate will only look at one schema at a time, and it would be VERY time-consuming to run it manually against all schemas to guarantee that they are up to date. However, if the source-controlled code cannot be relied upon, it becomes less useful...
What would seem ideal is some software that could update the source control (preferably GitHub, as this is used by other teams) from all the schemas, either periodically (even once a day) or when triggered by DDL being run.
I cannot find any existing software which can simply be used to do this.
Is there a problem with doing this? (there is no need to address multiple developers overwriting each others work on the same day as we have this covered in a separate process) Is anyone doing this? Can anyone recommend a way to do this?

We do this with the help of a PL/SQL function, a Python script and a shell script:
The PL/SQL function generates the DDL of a whole schema and returns it as a CLOB
The Python script connects to the database, fetches the DDL and stores it in files
The shell script runs the source control tool to add the modifications (we use Bazaar here).
You can see the scripts on PasteBin:
The PL/SQL function is here: http://pastebin.com/AG2Fa9zL
The python program (schema_exporter.py): http://pastebin.com/nd8Lf0gK
The shell script:
python schema_exporter.py
d=$(date +%Y-%m-%d__%H_%M_%S)
bzr add
bzr st | grep -q -E 'added|modified' && bzr commit -m "Database objects on $d"
exit 0
This shell script is configured to run from cron every day.
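
The linked schema_exporter.py is not reproduced here. As a rough, hypothetical sketch of its "store the DDL in files" step, assuming the DDL has already been fetched and split into one string per object, files could be written only when their content changes so that the source control tool sees just the real modifications:

```python
from pathlib import Path
from typing import Dict, List


def write_ddl_files(ddl_by_object: Dict[str, str], out_dir: str) -> List[str]:
    """Write one <object>.sql file per entry; return the names of files that changed.

    `ddl_by_object` maps an object name to its DDL text. How that mapping is
    fetched from the database is outside this sketch.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    changed = []
    for name, ddl in sorted(ddl_by_object.items()):
        target = out / f"{name}.sql"
        # Skip the write when the stored DDL is already identical.
        if not target.exists() or target.read_text(encoding="utf-8") != ddl:
            target.write_text(ddl, encoding="utf-8")
            changed.append(target.name)
    return changed
```

An empty return value means nothing changed since the last export, which corresponds to the case where the shell script above skips the commit.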

Being in the database version control space for 5 years (as director of product management at DBmaestro) and having worked as a DBA for over two decades, I can tell you the simple fact that you cannot treat the database objects as you treat your Java, C# or other files and save the changes in simple DDL scripts.
There are many reasons and I'll name a few:
Files are stored locally on the developer’s PC, and the changes s/he
makes do not affect other developers. Likewise, the developer is not
affected by changes made by colleagues. In a database this is
(usually) not the case: developers share the same database
environment, so any change committed to the database affects
others.
Publishing code changes is done using Check-In / Submit Changes /
etc. (depending on which source control tool you use). At that point,
the code from the developer's local directory is inserted into
the source control repository. A developer who wants the latest
code needs to request it from the source control tool. In a database the
change already exists and impacts other data even if it was not
checked in to the repository.
During the file check-in, the source control tool performs a conflict
check to see if the same file was modified and checked in by another
developer during the time you modified your local copy. Again, there
is no check for this in the database. If you alter a procedure from
your local PC and at the same time I modify the same procedure with
code from my local PC, then we overwrite each other’s changes.
The build process for code is done by getting the label / latest
version of the code into an empty directory and then performing a build
(compile). The output is binaries, which we copy over the
existing ones. We don't care what was there before. In a database we cannot
recreate the database, as we need to maintain the data! Also, the
deployment executes SQL scripts which were generated in the build
process.
When executing the SQL scripts (with the DDL, DCL, and DML (for static
content) commands) you assume the current structure of the
environment matches the structure when you created the scripts. If not,
then your scripts can fail, for example because you are trying to add a
new column which already exists.
Treating SQL scripts as code and manually generating them will cause
syntax errors, database dependency errors, and scripts that are not
reusable, which complicates the task of developing, maintaining, and
testing those scripts. In addition, those scripts may run on an
environment which is different from the one you thought they would run
on.
Sometimes the script in the version control repository does not match
the structure of the object that was tested and then errors will
happen in production!
There are many more, but I think you got the picture.
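
The "add a column which already exists" failure mentioned above is easy to reproduce. The sketch below uses Python's built-in sqlite3 purely as a stand-in for the real database (an assumption for illustration; the data dictionary query on Oracle would differ) and shows a script step that checks the current structure before altering it.

```python
import sqlite3


def column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
    """Return True if `table` already has a column named `column` (SQLite-specific)."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)  # row[1] is the column name


def add_column_if_missing(conn: sqlite3.Connection, table: str,
                          column: str, col_type: str) -> bool:
    """Add the column only when it is absent; return True if it was added."""
    if column_exists(conn, table, column):
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {col_type}")
    return True
```

Running a plain ALTER TABLE ... ADD COLUMN a second time raises an error, whereas the guarded version can be re-run safely. That only covers this one case; it does not address the broader drift between environments described above.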
What I found that works is the following:
Use an enforced version control system that enforces
check-out/check-in operations on the database objects. This will
make sure the version control repository matches the code that was
checked in, as it reads the metadata of the object in the check-in
operation and not as a separate step done manually. This also allows
several developers to work in parallel on the same database while
preventing them from accidentally overwriting each other's code.
Use an impact analysis that utilizes baselines as part of the
comparison to identify conflicts and to determine whether a difference (when
comparing the object's structure between the source control
repository and the database) is a real change that originates from
development or a difference that originated from a different path,
such as a different branch or an emergency fix, in which case it
should be skipped.
Use a solution that knows how to perform Impact Analysis for many
schemas at once, using UI or using API in order to eventually
automate the build & deploy process.
An article I wrote on this was published here, you are welcome to read it.

To me it seems like your way of working is backwards: developers run DDL against the DB in an unordered fashion and then you need an automated tool for inferring the changes (and the DDL) that were run.
The process would be in better control if you did the following instead:
Developers write DDL as SQL scripts, preferably using a migration tool such as Flyway (http://flywaydb.org/documentation/migration/sql.html).
Migration scripts are checked into version control
Migration scripts are periodically run against the DB (e.g. by the migration tool)
In this workflow, the DB would only get altered through automated migration scripts and no one is allowed to make changes manually. Could this work for you?
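
To make the workflow concrete, here is a toy migration runner. It is not how Flyway is implemented, only a sketch of the idea: scripts are applied in file-name order and recorded in a history table so that each one runs exactly once. It uses Python's built-in sqlite3 as a stand-in database; the table name schema_history and the zero-padded V001__name.sql naming are assumptions.

```python
import sqlite3
from pathlib import Path
from typing import List


def migrate(conn: sqlite3.Connection, migrations_dir: str) -> List[str]:
    """Apply not-yet-applied V*.sql scripts in file-name order; return their names."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_history (script TEXT PRIMARY KEY)")
    applied = {row[0] for row in conn.execute("SELECT script FROM schema_history")}
    newly_applied = []
    # Zero-padded versions (V001, V002, ...) keep the lexical order correct.
    for path in sorted(Path(migrations_dir).glob("V*.sql")):
        if path.name in applied:
            continue
        conn.executescript(path.read_text(encoding="utf-8"))
        conn.execute("INSERT INTO schema_history (script) VALUES (?)", (path.name,))
        conn.commit()
        newly_applied.append(path.name)
    return newly_applied
```

Because the history table records what has run, calling migrate repeatedly (for example from a scheduled job) is harmless: only scripts added to version control since the last run are executed.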

(I develop the Oracle tools for Redgate)
Actually, using the tools you can already do what I think you're asking for, with Schema Compare for Oracle.
You can compare multiple schemas either in the UI or via the command line - I think what you're after is automating the command line tool which can create difference scripts, sync between source and destination (live, snapshot or scripts) and generate reports.
You can automate the command line to sync to a scripts folder which is your source code checkout and then subsequently run a command to commit the changes.
I think that's all good :)

We built a commercial tool that bridges Oracle with Git. It helps you manage your database objects with Git. Basically, the database becomes the working directory for the developer. You can perform Git operations in the database such as reset, commit, branch, merge, etc., and the database code is updated automatically. It might be worth taking a look: https://www.gitora.com

Maven execution to perform several plugins in sequence

I'm working on something that is using Hibernate for database access. I've got everything set up and working so that I can use mvn hibernate3:hbm2ddl to build the database schema, and I'm using mvn liquibase:update to populate initial data into the database (DBUnit was my first try but I couldn't get it to work with Oracle and Liquibase just worked first time).
My problem is that if I execute hbm2ddl to drop and re-create the schema then the Liquibase DATABASECHANGELOG tables are left intact, meaning that Liquibase won't re-create the data the next time it's run. To get around this I've configured up mvn sql:execute to perform a drop on the two tables in question, but this means that to be safe if I want to build the database from scratch I now need to execute "mvn hibernate3:hbm2ddl sql:execute liquibase:update"
What I'd really like is to be able to configure up something that will execute the sql:execute command when the hibernate3:hbm2ddl command is run, so that I know that doing that one command will leave me in a nice clean database state. Failing that, a configuration that will run a number of commands in sequence automatically, so I could configure up for example "mvn execute:db-rebuild" to run the three commands above automatically.
I've seen mention of mojo-executor but no examples on actually how to use it. I'm not even sure if it's the right tool for what I want...

Why don't you bind these different goals to a particular phase, such as integration-test? The order of the plugins will define the order of executions. Then you no longer need to call mvn ... by hand.
