How do I start a dbt run from the dbt-cloud cli?

I have a dbt project, and the dbt-cloud cli. I have my account ID and my token. I want to do a dbt run.
When I try dbt-cloud job run ... it wants a job ID, but it doesn't seem to like the one I get from dbt-cloud job list.
Do I need to create a job template sort of thing on the cloud version? Or create a JSON file to use with dbt-cloud job create? I haven't figured out how to use --execute-steps with the job create call...
ETA: AH, I seem to have too many accounts and was running from the wrong one? I think it's working now?
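ETA 2, for anyone landing here later: the rough flow that works for me (this assumes the community dbt-cloud-cli from data-mie; double-check flag and variable names against your version):

    # Credentials can go in flags or env vars; make sure the account ID is the right one!
    export DBT_CLOUD_API_TOKEN="<token>"
    export DBT_CLOUD_ACCOUNT_ID="<account id>"

    # List the jobs under THIS account to find a job ID, then trigger a run
    dbt-cloud job list
    dbt-cloud job run --job-id <job id>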

Related

DBT test execution: create an HTML file or GUI reporter

Hello everyone, I would like to ask for your assistance: is there a way I could create an HTML/GUI report when running the "dbt test" command? Your response is highly appreciated.
This is the result I got upon running dbt test
There are some third-party libraries that will generate reports like the one you're looking for. One you can try is called Elementary Data (https://www.elementary-data.com/). When you run dbt tasks, it creates a schema with all of the stats on model execution timings, test results, etc. It then has a command to generate an HTML page from those results, publishing some simple graphs and aggregated test results.
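If it helps, the Elementary flow is roughly the following (a sketch; command names are from Elementary's docs at the time of writing, so verify against the current docs):

    # Install the Elementary CLI; the elementary dbt package also has to be
    # added to your project's packages.yml, plus an "elementary" profile in profiles.yml
    pip install elementary-data

    # After dbt run / dbt test have populated Elementary's schema,
    # generate a standalone HTML report from those results
    edr report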

CI/CD pipeline for BigQuery schema/DDL deployment

I am looking for a CI/CD solution for Google BigQuery scripts.
The requirement: I have a list of files with DDL scripts, and I need to design a CI/CD solution that maintains versions and deploys the scripts to Google BigQuery automatically or on a schedule.
Since you want to use version control to commit the schema, you can use the CI for Data in BigQuery CLI utility (GitHub repository), which will help you orchestrate the process. For more information you can check this documentation; for implementation, you can check this link.
Since you want CD, Cloud Build can be used with BigQuery, and you can use your own custom builders for your requirements. You can also configure notifications for both BigQuery and GitHub using Cloud Build.
As a product recommendation: for CI, use Cloud Source Repositories, and for CD, use Cloud Build.
There are multiple ways to deploy:
Option 1: here you specify the query inline in the Cloud Build steps. This does not pick up the latest committed version of your SQL, since the query is hard-coded in the build config; see option 2 for that.
In the step below, $PROJECT_ID and $_DATASET are dynamic substitution variables that you set at run time in Cloud Build; you can set other variables the same way.
    - name: 'gcr.io/cloud-builders/gcloud'
      entrypoint: 'bq'
      id: 'create entry min day view'
      args:
        - query
        - --use_legacy_sql=false
        - "CREATE OR REPLACE TABLE $PROJECT_ID.$_DATASET.TABLENAME AS SELECT 1"
Option 2:
There is a post about this here.
Per the last post at the link above, you can use bash as the entrypoint and pass the bq arguments as args.
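For illustration, the command such a bash entrypoint would run is something like this (a sketch; the file path is made up, and any $PROJECT_ID/$_DATASET placeholders inside the file would need to be substituted first, e.g. with sed or envsubst):

    # Run the currently checked-out version of the SQL file against BigQuery,
    # so the build always deploys the latest committed SQL
    set -euo pipefail
    bq query --use_legacy_sql=false < sql/create_table.sql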
Hope this helps.

Automatically execute Redshift SQL code stored on GitHub periodically using Jenkins

I have a few Redshift SQL scripts on GitHub which I need to execute in sequence every 30 minutes.
e.g.
At 10:00 am
script001.sql to be executed first
script002.sql to be executed next and so on...
At 10:30 am
script001.sql to be executed first
script002.sql to be executed next and so on...
These scripts run on already existing tables in Redshift. Some of these scripts create tables which are used in the subsequent queries and hence order of execution is important to avoid "Table not found" error.
I have tried:
Creating a freestyle project in Jenkins with the following configuration:
General Tab --> GitHub Project and provided a Project URL
Source Code Management Tab --> Selected Git and provided Repository URL in the format https://personaltoken@github.com/site/repo.git
Branches to build Tab --> */main
Build Triggers Tab --> Build periodically (H/30 * * * *)
Now I don't know how to add Build Step to execute the query from GitHub. The configuration builds successfully but obviously does nothing as no steps have been defined.
Creating a pipeline project in Jenkins with the same configuration as above, but without a Pipeline script, as I am not sure how to write one to run Redshift SQL stored on GitHub.
I tried looking for a solution but couldn't find anything for Redshift. There are tutorials and snippets for SQL Server and Oracle, but their scripts are different and can't be used with Redshift.
Any help on this would be greatly appreciated.
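For anyone stuck at the same point: since Redshift speaks the PostgreSQL wire protocol, one option is an "Execute shell" build step that runs the checked-out scripts in order with psql. A sketch, with placeholder connection details:

    # Jenkins "Execute shell" build step (endpoint, database, and user are placeholders)
    export PGPASSWORD="$REDSHIFT_PASSWORD"    # inject via Jenkins credentials, don't hardcode
    for f in script001.sql script002.sql; do  # explicit list keeps the execution order
        psql -h mycluster.abc123.us-east-1.redshift.amazonaws.com -p 5439 \
             -U myuser -d mydb -v ON_ERROR_STOP=1 -f "$f"
    done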

Splitting Jenkins Job to run concurrently

Does anyone know of a way to split a single Jenkins job into parts and run them concurrently/parallel?
For example if I have a job that runs tests which take 30 minutes, is there a way I can break this job into three 10 minute runs that run at the same time but in three different instances
Thanks in advance.
Create new jobs, called e.g. Test. You should select the job type based on the type of the root job.
If you have a Maven job type, you can set the workspace directory under Build -> Advanced. The Freestyle job type has this option directly under Project -> Advanced.
Set the same working directory for all jobs. The root job will compile, and all the other jobs will use the same working directory to reuse the compiled output.
For the test jobs, add the test execution as a build step, and vary which tests are executed in each job.
Edit your root job and remove the execution of the long-running tests from it. You can then call the three jobs from there, but you will need the Parameterized Trigger Plugin.
The downside of this approach is that you need enough Jenkins executors to handle all the test jobs.
If you're using Jenkins 1.x, I would suggest trying the multijob plugin - I've successfully used it to split a single job into a parent job plus multiple child jobs:
https://wiki.jenkins-ci.org/display/JENKINS/Multijob+Plugin
If you're using Jenkins 2.x, then try out the pipeline feature :) It makes running parallel tasks very easy:
https://github.com/jenkinsci/pipeline-plugin/blob/master/TUTORIAL.md#creating-multiple-threads
If you want, I believe you can also use pipelines in Jenkins 1.x by means of a plugin. I haven't looked into that, though.
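Pipeline's parallel step is written in Groovy, but the underlying idea, chunking the suite and running the chunks concurrently while failing if any chunk fails, can be sketched in plain shell (the test runner and its chunk flag are hypothetical):

    # Start three test chunks concurrently and remember their PIDs
    pids=()
    ./run_tests.sh --chunk 1/3 & pids+=($!)
    ./run_tests.sh --chunk 2/3 & pids+=($!)
    ./run_tests.sh --chunk 3/3 & pids+=($!)

    # Wait for every chunk; fail the build if any chunk failed
    fail=0
    for pid in "${pids[@]}"; do
        wait "$pid" || fail=1
    done
    exit $fail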

Is there a way to see when migrations were run?

In one of our apps, db:migrate is not set to run automatically on every deploy to Heroku. It hasn't been set up for continuous integration yet.
We've run into an issue, and for debugging purposes I want to see when a particular migration was run.
Is this possible?
Rails does not provide such a feature. When you run a migration, the migration's version (the timestamp prefix of its filename) is recorded in the schema_migrations table, so Rails remembers which migrations have been executed and which have not, but it does not record when they ran.
So what you can do is log in to your database (psql if using Postgres) and find the creation or update time of the affected table manually.
This may help you do so: https://stackoverflow.com/a/11868687/1970061
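For example, on Postgres you can at least list which migrations have been applied (a sketch; the database name is a placeholder, and note that the versions are the timestamps from the migration filenames, i.e. when each migration was written, not when it was executed):

    psql -d myapp_production -c "SELECT version FROM schema_migrations ORDER BY version;"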