dbt depends on a source not found - dbt

Could you please help me with this issue?
Encountered an error:
Compilation Error in model metrics_model (models\example\metrics_model.sql)
Model 'model.test_project.metrics_model' (models\example\metrics_model.sql) depends on a source named 'automate.metrics' which was not found
I keep getting this error and have not been able to solve it.
Many thanks beforehand!

This is due to the automate.metrics table missing from the database (either the dbt project’s target database or a different database on the same server). There should be a source.yml or automate.yml file somewhere in your project that defines the source. FYI automate is the schema name and metrics is the table name.
If the source yml file specifies a database for the automate schema, query that database to make sure that the metrics table exists in the automate schema.
If the source yml file doesn’t list a database, then that schema / table should exist in the dbt project’s target database. You can see what the target database is by looking at the profile your project is configured to use in ~/.dbt/profiles.yml.
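For reference, a minimal source definition for this case could look like the sketch below (the file location and the assumption that the source name matches the schema are mine; adjust them to your project):

version: 2

sources:
  - name: automate
    schema: automate
    tables:
      - name: metrics

With that in place, metrics_model.sql can reference the table as {{ source('automate', 'metrics') }}, which is the reference the compilation error is complaining about.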

For a PostgreSQL database, please check that the sources.yml file is defined as follows:
version: 2

sources:
  - name: name_of_the_source
    schema: name_of_the_schema
    quoting:
      database: false
      schema: false
      identifier: false
    loader: stitch
    tables:
      - name: name_of_table1
      - name: name_of_table2

Are you seeing this in your dev environment? It's possible that you haven't run dbt run after creating the automate.metrics source, which is preventing metrics_model from referencing it.

Check whether you put the source config in the right YAML file. I encountered this issue and tried every solution, including the one above. It finally turned out that I had forgotten the .yml suffix on the source file, so dbt couldn't locate the source config in that file.
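As a quick sanity check (assuming a dbt version recent enough to have the ls command), you can ask dbt which sources it actually parsed:

dbt ls --resource-type source

If automate.metrics is missing from the output, dbt is not reading your source yml at all (wrong folder, missing .yml extension, or a parse error).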

Related

How to rename changelog files in Liquibase?

I'm using Liquibase for my project and would like to rename my changelog files.
From old structure:
databaseChangeLog:
  - include:
      file: db/changelog/add_pokemons.sql
  - include:
      file: db/changelog/add_minions.sql
To new structure:
databaseChangeLog:
  - include:
      file: db/changelog/v001__add_table_pokemons.sql
  - include:
      file: db/changelog/v002__add_table_minions.sql
Unfortunately, the Liquibase documentation does not describe best practices for this kind of change.
I was thinking about duplicating my files, so I would have the old ones plus the new ones in one directory, and then writing a migration to change the filename column in the databasechangelog table.
What would be the best approach in your opinion?
The problem here is that Liquibase stores the name of the file in databasechangelog.filename. So if you are not using logicalFilePath yet, you may still have a chance.
If the files you've posted are formatted SQL files, then you could do something like this (example for the file db/changelog/add_pokemons.sql):
--liquibase formatted sql logicalFilePath:db/changelog/add_pokemons.sql
and you can rename the actual file to whatever you want.
In that case nothing should break in the existing databasechangelog table.
Other than that, there is probably no built-in Liquibase feature that will do this automatically.
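To make that concrete, here is a rough sketch of what the renamed file db/changelog/v001__add_table_pokemons.sql could look like; the existing changesets are kept exactly as they were, and only the header pins the logical path to the old file name so the existing databasechangelog rows still match:

--liquibase formatted sql logicalFilePath:db/changelog/add_pokemons.sql

--changeset original_author:original_id
-- the original changeset body stays exactly as it was, e.g.:
CREATE TABLE pokemons (
    id   INT PRIMARY KEY,
    name VARCHAR(100)
);

original_author and original_id are placeholders here; they must stay identical to the values already recorded in databasechangelog, otherwise Liquibase will treat the changeset as new.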

'DBT docs generate' does not populate model column-level comments in the catalog

I use dbt-snowflake 1.1.0 with the corresponding dbt-core 1.1.0.
I added documentation for my models in yaml files, i.e.:
> models/stage/docs.yml
version: 2

models:
  - name: raw_weblogs
    description: Web logs of customer interaction, broken down by attribute (bronze). The breakdown is performed using regular expressions.
    columns:
      - name: ip
        description: IP address from which the request reached the server (might be direct customer IP or the address of a VPN/proxy).
...
These details show up correctly in the dbt UI when I run dbt docs generate and then dbt docs serve, yet they are not listed in target/catalog.json:
cat target/catalog.json | grep identity
(no results)
According to the DBT documentation, I understand that column comments should be part of catalog.json.
LATER EDIT: I tried running dbt --debug docs generate, and it seems that all data is retrieved directly from the target environment (in my case, Snowflake). Looking at the columns of my model in Snowflake, they indeed do NOT have any comments set on them in Snowflake.
It thus seems that the underlying problem is that dbt run does not persist the column metadata to Snowflake.
After further investigation, I found that the missing database comments were indeed the cause: comments are written to catalog.json when running dbt docs generate based on what is received from the database, while dbt docs serve populates the UI by combining information from catalog.json with metadata (in my case, the documented column comments) from the local dbt models.
The solution to persist such metadata in the database with dbt run was to add the following DBT configuration:
> dbt_project.yml
models:
  <project>:
    ...
    +persist_docs:
      relation: true
      columns: true
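The config only takes effect on the next build, so after adding it you need to rebuild the models (so dbt writes the descriptions as comments in Snowflake) and then regenerate the docs so those comments end up in catalog.json. Roughly:

dbt run
dbt docs generate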

How to best implement dynamic dbt datasets

I'm cleaning up a dbt + BigQuery environment and trying to implement a staging environment that pulls from a staging dataset. Problem is that the current .yml files with source information all explicitly point to a production dataset.
One option that I am considering is a source wrapper function that will serve as an adapter and inject the proper dataset depending on some passed CLI var or profile target (which is different for the staging vs prod environments).
However, I'm fairly new to dbt so unsure if this is the best way to go about this. Would appreciate any insight you kind folks have :)
EDIT: I'm realizing that a source wrapper is not the way to go as it would mess with the generated DAG
You can supply the name of the schema for a source in a variable or environment variable, and set that variable at runtime.
In your sources.yml:
version: 2

sources:
  - name: jaffle_shop
    schema: "{{ var('source_jaffle_shop_schema') }}"
    tables:
      - name: orders
In your dbt_project.yml:
vars:
  source_jaffle_shop_schema: MY_DEFAULT_SCHEMA
And then to override at runtime:
dbt run --vars "{source_jaffle_shop_schema: my_other_schema}"
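If you would rather not pass --vars on every run, a variation on the same idea (a sketch only; the environment variable name here is made up) is to read the schema from an environment variable with a default:

version: 2

sources:
  - name: jaffle_shop
    # JAFFLE_SHOP_SCHEMA is a hypothetical variable name; the second argument is the fallback
    schema: "{{ env_var('JAFFLE_SHOP_SCHEMA', 'MY_DEFAULT_SCHEMA') }}"
    tables:
      - name: orders

Then the staging environment can export JAFFLE_SHOP_SCHEMA=my_staging_dataset before invoking dbt run, and production simply falls back to the default.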

DBT manifest.json data type is null

I'm creating a lot of tooling based on the manifest.json that dbt generates for me. But for whatever reason, the "data_type" property for each column is always None in the manifest.json, even though I can see it in the catalog.json; I believe the data type there is read from the database.
How do I get the data_type attribute populated in my manifest.json file?
Some helpful answers from this dbt Slack thread:
first reply (h/t Daniel Luftspring)
Not sure if this is the only way, but I'm running dbt version 0.20.1 and you can specify the data_type as a column property in your schema.yml, and it will show up in the manifest like so:
columns:
  - name: city
    data_type: string
If you have a big project and want to automate this, you could probably pull together a script to edit your schema files in place and sync the data types with your database using the information schema.
second reply (h/t Jonathon Talmi)
FYI catalog.json has data types because it queries the metadata tables in your data warehouse (e.g. the information schema in Snowflake) to construct the catalog, whereas a traditional dbt compile/run/etc., which generates the manifest, does not run such queries.
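So if all you need are the column types, one pragmatic workaround (a sketch; the model unique_id below is a placeholder and the exact JSON layout can vary between dbt versions) is to take them from catalog.json after generating the docs:

dbt docs generate
jq '.nodes["model.my_project.my_model"].columns' target/catalog.json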

Run an initial Liquibase script

This is my 2nd day using Liquibase.
I have a 'backup' or 'repository' with the database that I need to create locally on my PC.
I have looked at the documentation, but I'm really not 100% clear on how to run it.
I've updated the liquibase.properties file to reflect the correct paths, username, and password.
How do you run the update command to generate the tables and test data?
Windows 7
The Liquibase documentation on 'Adding Liquibase to an existing project' is probably the best place to start. Basically, you want to set the properties file so that it refers to the existing 'backup' database, and then run liquibase generateChangeLog.
This will connect to the existing database and generate a file, called a changelog, that contains the structure of the existing database expressed (typically) in XML. You then create a new properties file that connects to your local database and use liquibase update to apply the changelog to the local database and populate the structure.
Note that this does not typically transfer the data from the existing database to the new database, just the structure: the tables, keys, indexes, etc. If you want to have test data as well, you can either export that data from the existing database or look into crafting the changesets manually. To export the data, a command like this would be used:
java -jar liquibase.jar --changeLogFile="./data/<insert file name> " --diffTypes="data" generateChangeLog
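For the local-database half of this, the new properties file would roughly look like the sketch below (the JDBC URL, driver, and credentials are placeholders for whatever your local database actually is):

changeLogFile=changelog.xml
url=jdbc:postgresql://localhost:5432/my_local_db
username=my_user
password=my_password
driver=org.postgresql.Driver

With that in place, liquibase update (or java -jar liquibase.jar update, depending on how you invoke it) applies the generated changelog and creates the tables locally.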