Macro for custom schema names doesn't apply in a dbt package - dbt

I have an issue using custom schema names in a dbt package.
I use the macro provided in the dbt documentation:
{% macro generate_schema_name(custom_schema_name, node) -%}
    {%- set default_schema = target.schema -%}
    {%- if custom_schema_name is none -%}
        {{ default_schema }}
    {%- else -%}
        {{ default_schema }}_{{ custom_schema_name | trim }}
    {%- endif -%}
{%- endmacro %}
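For context, the custom_schema_name argument receives whatever a model sets through its schema config, either per model or via the schema keys in dbt_project.yml shown below. A per-model example (the model file and its contents are illustrative) would be:
-- models/stg/my_model.sql
{{ config(schema='stg') }}
select 1 as id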
I put this macro in my dbt package, and I use that package in another dbt project.
Here is the dbt_project.yml in my dbt project:
name: 'covid_france'
version: '0.0.1'
config-version: 2
profile: 'default'
source-paths: ["models"]
analysis-paths: ["analysis"]
test-paths: ["tests"]
data-paths: ["data"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]
target-path: "target"
clean-targets:
  - "target"
  - "dbt_modules"
And the dbt_project.yml in my dbt package:
name: 'covid_france'
version: '0.0.1'
config-version: 2
profile: 'default'
source-paths: ["models"]
analysis-paths: ["analysis"]
test-paths: ["tests"]
data-paths: ["data"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]
target-path: "target"
clean-targets:
  - "target"
  - "dbt_modules"
models:
  covid_france:
    stg:
      materialized: table
      schema: stg
    ods:
      materialized: table
      process-airbyte-outputs:
        schema: ods
      unions:
        schema: ods
    prs:
      materialized: view
When I run my dbt project, it imports the dbt package but doesn't apply the macro, which is supposed to remove the main schema prefix (the one provided in profiles.yml) from custom schema names.
For instance: the schema provided in my profiles.yml is "prs", and I have other custom schemas named ods and stg. But when dbt runs, it creates prs, prs_ods and prs_stg.
The macro used to work fine when I used it directly in a dbt project (instead of putting it in a dbt package that the project imports).
Thank you in advance !

In the docs: https://docs.getdbt.com/docs/building-a-dbt-project/building-models/using-custom-schemas#changing-the-way-dbt-generates-a-schema-name
It says:
Note: dbt ignores any custom generate_schema_name macros that are part of a package installed in your project.
So the way around that would be to create a small "shim" or thin wrapper directly in your project that calls the macro in your package.
I think it's a bit confusing that your package project name is the same as your actual project name (both line 1 of the dbt_project.yml files), so I would name them differently for clarity.
E.g., assuming you rename the package to package_project_name, keep the macro code as you already have it in your package, but in your project add another macro like:
{% macro generate_schema_name(custom_schema_name, node) -%}
    {{- package_project_name.generate_schema_name(custom_schema_name, node) }}
{%- endmacro %}
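For clarity, that rename would happen on the first line of the package's own dbt_project.yml (package_project_name is just an illustrative name):
name: 'package_project_name'
version: '0.0.1'
config-version: 2
With the shim in place, dbt resolves generate_schema_name from your root project, and the shim simply dispatches to the package's implementation.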

I believe you also need to define the target of the job, since the macro depends on the target schema.
For example:
dbt run --models my_model --target dev
In a dbt Cloud job, you can define it as:
"settings": {"threads": 1, "target_name": "prod"},

Related

Env var required but not provided - dbt CLI

We have an environment variable set in dbt Cloud called DBT_SNOWFLAKE_ENV that selects the right database depending on which environment is used.
At the moment, I'm trying to set up the dbt CLI with VS Code. I created a profiles.yml file that looks like this:
default:
  target: development
  outputs:
    development:
      type: snowflake
      account: skxxxx.eu-central-1
      user: <name>
      password: <pass>
      role: sysadmin
      warehouse: transformations_dw
      database: "{{ env_var('DBT_SNOWFLAKE_ENV', 'analytics_dev') }}"
      schema: transformations
      threads: 4
I added the env_var line after some suggestions, but I realise the environment variable still doesn't exist yet. The problem is that even if I hardcode analytics_dev in that place (which makes sense), the error persists.
I wouldn't want anybody who uses dbt to have to change the environment variable when they want to run something on production.
What are my options here?
You can set up a source file for the variables when using the dbt CLI - for example, you would create a bash script called set_env_var.sh and then run source set_env_var.sh in your terminal.
An example of the bash script would be:
export SNOWFLAKE_ACCOUNT=xxxxx
export SNOWFLAKE_USER=xxxxx
export SNOWFLAKE_ROLE=xxxx
export SNOWFLAKE_SCHEMA=xxxx
export SNOWFLAKE_WAREHOUSE=xxxxx
and in your profiles.yml you can reference all the variables you want, for example:
warehouse: "{{ env_var('SNOWFLAKE_WAREHOUSE') }}"
database: "{{ env_var('SNOWFLAKE_DATABASE') }}"
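Usage would then look something like this (note that the script sketch above doesn't export SNOWFLAKE_DATABASE, so you'd add an export for it as well):
source set_env_var.sh
dbt run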
Hope this helps.
First of all, you have to hard-code the database name; the other syntax is wrong. Secondly, try to make a dynamic variable for the environment and then pass it along when you invoke dbt, e.g.:
dbt snapshot --profile <profile> --vars "$DBT_SNOWFLAKE_ENV"
That way, when dbt runs, it can easily pick the value up from the environment.
Currently I am working on dbt with everything dynamic; even the full profile is dynamic according to schema and database.
In my case, my variable was declared as part of vars within my dbt_project.yml file, so instead of accessing the variable like
"{{ env_var('MY_VARIABLE') }}"
I should have used:
"{{ var('MY_VARIABLE') }}"

How to modify variable inside ansible jinja2 template

I am passing a variable called x_version=v5.5.9.1 to an Ansible Jinja2 template (a bash script).
Inside the receiving bash script (the Jinja2 template), the variable x_version should be reduced to v5.5.9.
version_defined_in_ansible={{ x_version }}
The modification below helped me:
version_defined_in_ansible=v{{ x_version.split('v')[1][0:5] }}
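Note that the [0:5] slice hardcodes the length of the version string. If the goal is just to drop the last dotted component, a regex_replace alternative (assuming the version always ends in '.<number>') would be:
version_defined_in_ansible={{ x_version | regex_replace('\.\d+$', '') }}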
Given the variable
x_version: v5.5.9.1
The simplest approach is to split off the extension:
{{ x_version|splitext|first }}
evaluates to
v5.5.9
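To see it wired end to end, a minimal sketch (the play and file names are illustrative):
# playbook.yml
- hosts: localhost
  vars:
    x_version: v5.5.9.1
  tasks:
    - template:
        src: script.sh.j2
        dest: /tmp/script.sh
# script.sh.j2
version_defined_in_ansible={{ x_version | splitext | first }}
Running it would render /tmp/script.sh containing version_defined_in_ansible=v5.5.9.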

YAML Variables, can you reference variables within variables?

I am using a variables.yml file as a template to store different variables, and I was curious whether I can reference variables within the yml file itself to essentially nest them.
For example:
#variables.yml file
variables:
  food1: 'pineapple'
  food2: 'pizza'
  favoriteFood: '$(food1) $(food2)'
So that when I eventually call upon this variable "favoriteFood", I can just use ${{ variables.favoriteFood }} and the value should be "pineapple pizza"
Example:
#mainPipeline.yml file
variables:
  - template: 'variables.yml'
steps:
  - script: echo My favorite food is ${{ variables.favoriteFood }}.
Am I on the right track here? I can't seem to find any examples of whether this is possible.
Yes! It is in fact possible; just follow the syntax outlined above. Don't forget spacing is critical in YAML files.

PyCharm: Reformat code breaks django template

If I run "reformat code" PyCharm changes this line:
{% ajax_dialog_opener url=duplicate_url|add:'?hide_messages=true' reload_on_success=False label='FoooBaar' dialog_title='Foo foo baaar' type='link'
After reformat code:
{% ajax_dialog_opener url=duplicate_url|add:'?hide_messages=true' reload_on_success=False label='FoooBaar' dialog_title='Foo foo baaar' type='link'
data_shortcut="mod+d" %}
But this means the new code is broken.
Is there a way to stop PyCharm from breaking above line?
Version: PyCharm community 2018.2
Go to
Preferences > Editor > Code Style
and change the "Hard wrap at" value to something bigger, like 1000 (the maximum).

How to use the value of an Ansible variable to target and get the value of a hostvar?

I'm in a situation where I have multiple Ansible roles, using multiple group_vars. Spread around each host's vars (depending on the host) are a number of directory paths, each in a different place within the hostvar tree.
I need to ensure that a certain number of these directories exist when provisioning. So I created a role that uses the file module to ensure that these directories exist. Well, it would do, if I could figure out how to get it to work.
I have a group_var something similar to:
ensure_dirs:
  - "daemons.builder.dirs.pending"
  - "processor.prep.logdir"
  - "shed.logdir"
- "shed.logdir"
Each of these three values maps directly to a group var containing the corresponding filesystem path, for example:
daemons:
  builder:
    dirs:
      pending: /home/builder/pending
I would like to somehow iterate over ensure_dirs and evaluate each item's value in order to resolve it to the FS path.
I've tried several approaches, but I can't seem to get the value I need. The closest I've come is the following, which simply returns the literal of the constructed string.
- file:
    dest: "hostvars['{{ ansible_hostname }}']['{{ item.split('.') | join(\"']['\") }}']"
    state: directory
  with_items: "{{ ensure_dirs }}"
This results in directories named, for example, hostvars['builder']['daemons']['builder']['dirs']['pending'] in the working directory. Of course, what I want is for the file module to use the value stored at that path in the hostvars, so that it will instead ensure that /home/builder/pending exists.
Anybody have any ideas?
There is a simple way: template your group variable.
group_var
ensure_dirs:
  - "{{ daemons.builder.dirs.pending }}"
  - "{{ processor.prep.logdir }}"
  - "{{ shed.logdir }}"
task
- file:
    path: "{{ item }}"
    state: directory
  with_items: "{{ ensure_dirs }}"
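Before creating anything, a quick way to verify that the templated list resolves to the real paths is a debug task:
- debug:
    var: ensure_dirs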
I suggest you create and use a lookup plugin.
Ansible defines lots of lookup plugins; the most popular is 'items', used via 'with_items'. The convention is 'with_<plugin name>'.
To create your lookup plugin:
Edit the ansible.cfg file and uncomment the key 'lookup_plugins', setting its value to './plugins/lookup'
Create a plugin file named 'dirs.py' in './plugins/lookup'
Use it in your playbook:
- file:
    dest: "{{ item }}"
    state: directory
  with_dirs: "{{ ensure_dirs }}"
Implement your plugin in dirs.py with something like this (see the lookup plugins documentation for more examples):
from ansible.plugins.lookup import LookupBase

class LookupModule(LookupBase):
    def run(self, terms, variables=None, **kwargs):
        # rewrite each dotted path as a filesystem-style path
        return [term.replace('.', '/') for term in terms]
Advantages:
* Your playbook is easier to read
* You can write Python unit tests for your plugin and improve it
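If you wanted the plugin to resolve each dotted path to the value it names (rather than just rewriting dots to slashes), a sketch along these lines might work, assuming the referenced values are plain strings rather than unresolved templates:

from ansible.plugins.lookup import LookupBase

class LookupModule(LookupBase):
    def run(self, terms, variables=None, **kwargs):
        # walk the play's variables key by key for each dotted path
        results = []
        for term in terms:
            value = variables
            for key in term.split('.'):
                value = value[key]
            results.append(value)
        return results

With the example data above, this would turn "daemons.builder.dirs.pending" into /home/builder/pending.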