Jinja for loop in Salt file.blockreplace for /etc/hosts - LDAP

I have some issues with the Jinja code inside my Salt state, which should update the /etc/hosts file from an LDAP pillar.
{% set CID = grains['CID'] %}
{% set ldap_pillar = 'ldap-hosts-{{CID}}' %}
ldap-hosts:
  file.blockreplace:
    - name: /tmp/hosts
    - marker_start: "# BEGIN SALT MANAGED CONTENT - DO NOT EDIT BETWEEN THIS - #"
    - marker_end: "# END SALT MANAGED CONTENT - DO NOT EDIT BETWEEN THIS - #"
    - content:
        {% for entry in {{ salt.pillar.get('ldap_pillar') }} %}
        {% for hostname, ip in entry.items %}
        {{ip}} {{hostname}}
        {% endfor %}
        {% endfor %}
    - show_changes: True
    - append_if_not_found: True
The LDAP pillar returns data in the following format:
local:
    |_
      ----------
      cn:
          host1.domain.tld
      ipHostNumber:
          4.4.4.4
    |_
      ----------
      cn:
          host2
      ipHostNumber:
          8.8.8.8
Now I'd like to collect all of the IPs and hostnames and build a valid hosts file.
Here is my error:
local:
    Data failed to compile:
----------
    Rendering SLS 'base:ldap_hosts' failed: Jinja syntax error: expected token ':', got '}'; line 10
---
[...]
  file.blockreplace:
    - name: /tmp/hosts
    - marker_start: "# BEGIN SALT MANAGED CONTENT - DO NOT EDIT BETWEEN THIS - #"
    - marker_end: "# END SALT MANAGED CONTENT - DO NOT EDIT BETWEEN THIS - #"
    - content:
        {% for entry in {{ salt.pillar.get('ldap_pillar') }} %}    <======================
        {% for hostname, ip in entry.items %}
        {{ip}} {{hostname}}
        {% endfor %}
        {% endfor %}
    - show_changes: True
[...]
---

I just fixed it; it was quite easy. There were two mistakes: {{ }} must never be nested inside a {% %} tag (the expression goes in directly), and a {{CID}} placeholder inside a quoted string is not interpolated, so the pillar name has to be built with string concatenation.
{% set CID = grains['CID'] %}
{% set ldap_pillar = 'ldap-hosts-' + CID %}
ldap-hosts:
  file.blockreplace:
    - name: /etc/hosts
    - marker_start: "# BEGIN SALT MANAGED CONTENT - DO NOT EDIT BETWEEN THIS - #"
    - marker_end: "# END SALT MANAGED CONTENT - DO NOT EDIT BETWEEN THIS - #"
    - content: |
        {% for entry in salt['pillar.get'](ldap_pillar) -%}
        {{entry.ipHostNumber}} {{entry.cn}}
        {% endfor %}
    - show_changes: True
    - append_if_not_found: True
Now everything works as expected.
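For anyone hitting the same "expected token ':', got '}'" error: {{ }} is only for printing into the output, never for nesting inside a {% %} tag. A minimal sketch outside of Salt (pillar_data is a hypothetical variable):
{# broken: {% for entry in {{ pillar_data }} %} #}
{# working: inside {% %}, use the expression directly #}
{% for entry in pillar_data %}
{{ entry.ipHostNumber }} {{ entry.cn }}
{% endfor %}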

Related

dbt - how can I add model configuration (using a macro on {{ this }}) in dbt_project.yml?

I want to add node_color to all of my dbt models based on the filename prefix, to make it easier to navigate through my dbt documentation:
fact_ => red
base__ => black
To do so I have a macro that works well:
{% macro get_model_color(model) %}
    {% set default_color = 'blue' %}
    {% set ns = namespace(model_color=default_color) %}
    {% set dict_patterns = {"base__[a-z0-9_]+": "black", "ref_[a-z0-9_]+": "yellow", "fact_[a-z0-9_]+": "red"} %}
    {% set re = modules.re %}
    {% for pattern, color in dict_patterns.items() %}
        {% set is_match = re.match(pattern, model.identifier, re.IGNORECASE) %}
        {% if is_match %}
            {% set ns.model_color = color %}
        {% endif %}
    {% endfor %}
    {{ return({'node_color': ns.model_color}) }}
{% endmacro %}
And I call it in my model's .sql file:
{{ config(
    materialized = 'table',
    tags = ['daily'],
    docs = get_model_color(this),
) }}
This works well but forces me to add this line of code to all my models (and to all new ones).
Is there a way I can define it in my dbt_project.yml to make it apply to all my models automatically?
I have tried many things, like the config Jinja function, or this kind of code in dbt_project.yml:
+docs:
  node_color: "{{ get_model_color(this) }}"
which returns: Could not render {{ get_model_color(this) }}: 'get_model_color' is undefined
But nothing seems to work.
Any idea? Thanks

dbt - run a model only once

I've created a model to generate a calendar dimension which I only want to run when I explicitly specify to run it.
I tried using incremental materialisation with nothing in the is_incremental() block, hoping dbt would do nothing if there was no query to satisfy the temporary view. Unfortunately this didn't work.
Any suggestions or thoughts on how I might achieve this are greatly appreciated.
Regards,
Ashley
I've used a tag for this. Let's call this kind of thing a "static" model. In your model:
{{ config(tags=['static']) }}
and then in your production job:
dbt run --exclude tag:static
This doesn't quite achieve what you want, since you have to add the selector at the command line. But it's simple and self-documenting, which is nice.
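When you do want to build the static models, the same tag works with the inverse selector (standard dbt node selection syntax, nothing specific to this approach):
dbt run --select tag:static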
I think you should be able to hack the incremental materialization to do this. dbt will complain about empty models, but you should be able to return a query with zero records. Whether this is really much better/faster/cheaper than just running the model will depend on your RDBMS, since dbt will still execute a query with the complex merge logic.
{{ config(materialized='incremental') }}
{% if is_incremental() %}
select * from {{ this }} limit 0
{% else %}
-- your model here, e.g.
{{ dbt_utils.date_spine( ... ) }}
{% endif %}
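With this hack you can still rebuild the dimension on demand, because dbt's full-refresh flag makes is_incremental() return false and so takes the else branch (calendar_dim is a placeholder model name):
dbt run --select calendar_dim --full-refresh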
Your last/best option is probably to create a custom materialization that checks for an existing relation and no-ops if it finds one. You could borrow most of the code from the incremental materialization to do this. (You would add this as a macro in your project.) I haven't tested this, but to give you an idea:
-- macros/static_materialization.sql
{% materialization static, default -%}

  -- relations
  {%- set existing_relation = load_cached_relation(this) -%}
  {%- set target_relation = this.incorporate(type='table') -%}
  {%- set temp_relation = make_temp_relation(target_relation) -%}
  {%- set intermediate_relation = make_intermediate_relation(target_relation) -%}
  {%- set backup_relation_type = 'table' if existing_relation is none else existing_relation.type -%}
  {%- set backup_relation = make_backup_relation(target_relation, backup_relation_type) -%}

  -- configs
  {%- set unique_key = config.get('unique_key') -%}
  {%- set full_refresh_mode = (should_full_refresh() or existing_relation.is_view) -%}
  {%- set on_schema_change = incremental_validate_on_schema_change(config.get('on_schema_change'), default='ignore') -%}

  -- the temp_ and backup_ relations should not already exist in the database; get_relation
  -- will return None in that case. Otherwise, we get a relation that we can drop
  -- later, before we try to use this name for the current operation. This has to happen before
  -- BEGIN, in a separate transaction
  {%- set preexisting_intermediate_relation = load_cached_relation(intermediate_relation) -%}
  {%- set preexisting_backup_relation = load_cached_relation(backup_relation) -%}

  -- grab current tables grants config for comparison later on
  {% set grant_config = config.get('grants') %}

  {{ drop_relation_if_exists(preexisting_intermediate_relation) }}
  {{ drop_relation_if_exists(preexisting_backup_relation) }}

  {{ run_hooks(pre_hooks, inside_transaction=False) }}

  -- `BEGIN` happens here:
  {{ run_hooks(pre_hooks, inside_transaction=True) }}

  {% set to_drop = [] %}

  {% if existing_relation is none %}
    {% set build_sql = get_create_table_as_sql(False, target_relation, sql) %}
  {% elif full_refresh_mode %}
    {% set build_sql = get_create_table_as_sql(False, intermediate_relation, sql) %}
    {% set need_swap = true %}
  {% else %}
    {# ----- only changed the code between these comments ----- #}
    {# NO-OP. An incremental materialization would do a merge here #}
    {% set build_sql = "select 1" %}
    {# ----- only changed the code between these comments ----- #}
  {% endif %}

  {% call statement("main") %}
    {{ build_sql }}
  {% endcall %}

  {% if need_swap %}
    {% do adapter.rename_relation(target_relation, backup_relation) %}
    {% do adapter.rename_relation(intermediate_relation, target_relation) %}
    {% do to_drop.append(backup_relation) %}
  {% endif %}

  {% set should_revoke = should_revoke(existing_relation, full_refresh_mode) %}
  {% do apply_grants(target_relation, grant_config, should_revoke=should_revoke) %}

  {% do persist_docs(target_relation, model) %}

  {% if existing_relation is none or existing_relation.is_view or should_full_refresh() %}
    {% do create_indexes(target_relation) %}
  {% endif %}

  {{ run_hooks(post_hooks, inside_transaction=True) }}

  -- `COMMIT` happens here
  {% do adapter.commit() %}

  {% for rel in to_drop %}
    {% do adapter.drop_relation(rel) %}
  {% endfor %}

  {{ run_hooks(post_hooks, inside_transaction=False) }}

  {{ return({'relations': [target_relation]}) }}

{%- endmaterialization %}
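A model would then opt into this the same way it selects any other materialization. A minimal sketch of the model file, assuming the materialization above is added to your project's macros folder:
{{ config(materialized='static') }}
-- your calendar dimension query here, e.g.
{{ dbt_utils.date_spine( ... ) }}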
We are working with dbt run --select MODEL_NAME for each model we want to run, so a dbt run in our environment never executes more than one model. By doing so, you never end up in a situation where you execute a model by accident.

Replace Filter - How to pass a variable for the string to be replaced?

I am currently working on a Symfony 6 project and am trying to pass a variable into Twig's replace filter. However, that does not work for me.
I tried it like this:
{% if form.vars.data.fileName|default %}
    {% set system_type_short = get_env("SYSTEM_TYPE_SHORT") %}
    {% set replaces = '/var/www/' ~ system_type_short ~ '.domain.de/public/uploads/images/' %}
    {% set url = 'uploads/images/' ~ form.vars.data.fileName|replace({(replaces): ('')}) %}
    <img src="{{ asset(url) }}" height="100"><br><br>
{% endif %}
The error I get:
A hash key must be followed by a colon (:). Unexpected token "punctuation" of value "{" ("punctuation" expected with value ":").
Can anyone tell me what I am doing wrong, or how to pass a variable into the replace filter?
You cannot use the print syntax ({{ }}) when you are already inside a Twig expression.
So you just have to fix {{system_type_short}} and use string concatenation with ~ instead.
You would get:
{% if form.vars.data.fileName|default %}
    {% set system_type_short = get_env("SYSTEM_TYPE_SHORT") %}
    {% set url = 'uploads/images/' ~ form.vars.data.fileName|replace({('/var/www/' ~ system_type_short ~ '.domain.de/public/uploads/images/'): ''}) %}
    <img src="{{ asset(url) }}" height="100"><br><br>
{% endif %}
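Note the parentheses around the hash key: Twig only accepts an expression (rather than a literal name) as a hash key when the key is wrapped in parentheses, so a variable works fine there too. A minimal sketch (prefix is a hypothetical variable):
{% set prefix = '/var/www/example/' %}
{{ '/var/www/example/logo.png'|replace({(prefix): ''}) }} {# outputs: logo.png #}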

How to create histogram bins for use in dbt using Jinja template?

I am trying to create histogram bins in dbt using Jinja. This is the code I am using:
{% set sql_statement %}
    select min(eir) as min_eir, floor((max(eir) - min(eir))/10) + 1 as bin_size
    from {{ ref('interest_rate_table') }}
{% endset %}

{% set query_result = dbt_utils.get_query_results_as_dict(sql_statement) %}
{% set min_eir = query_result['min_eir'][0] %}
{% set bin_size = query_result['bin_size'][0] %}

{% set eir_bucket = [] %}
{% for i in range(10) %}
    {% set eir_bucket = eir_bucket.append(min_eir + i*bin_size) %}
{% endfor %}

{{ log(eir_bucket, info=True) }}

select 1 as num
The above code raises dbt.exceptions.UndefinedMacroException.
Below is the error log:
dbt.exceptions.UndefinedMacroException: Compilation Error in model terms_dist (/my/file/dir)
'bin_size' is undefined. This can happen when calling a macro that does not exist. Check for typos and/or install package dependencies with "dbt deps".
I haven't written the final SQL yet; first I want to build an array containing the histogram bins that I can use in my code.
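No answer is recorded here, but two Jinja details stand out in this code: list.append() returns None, so re-assigning with {% set %} inside the loop clobbers the list ({% do %} is the idiomatic way to call it for its side effect), and query results only exist at execution time, so the lookups should be guarded with the execute flag. A sketch of the relevant part with both changes, assuming dbt_utils is installed:
{% if execute %}
    {% set query_result = dbt_utils.get_query_results_as_dict(sql_statement) %}
    {% set min_eir = query_result['min_eir'][0] %}
    {% set bin_size = query_result['bin_size'][0] %}
    {% set eir_bucket = [] %}
    {% for i in range(10) %}
        {% do eir_bucket.append(min_eir + i * bin_size) %}
    {% endfor %}
    {{ log(eir_bucket, info=True) }}
{% endif %}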

Using one YAML definition for the same column as it moves through models

I have some model YAML:
version: 2
models:
  - name: my_model
    columns:
      - name: foo
        description: My bestest column
If I make other models which inherit from this one, is there any way to refer back to this column definition when documentation is generated, or do I need to copy-paste the column definition for each model in which the column appears?
In other words, is there a way of defining a column only once, to make edits and updates easier?
Cheers,
Graham
I think there are two ways to do it.
1. Using a macro
Create a file that contains all those reusable descriptions; you could even use params to customize the description.
E.g. doc_library.sql:
{% macro column_bestest_doc(col_name) %}
    My bestest column {{ col_name }}
{% endmacro %}
then use it in your model schema yml:
version: 2
models:
  - name: my_model
    columns:
      - name: foo_1
        description: "{{ column_bestest_doc('foo_1') }}"
  - name: my_model_another
    columns:
      - name: foo_2
        description: "{{ column_bestest_doc('foo_2') }}"
2. Using YAML anchors
You could use YAML anchors here as in any other yml file (note that anchors only work within a single file). Anchoring the whole column entry:
version: 2
models:
  - name: my_model
    columns:
      - &foo
        name: foo
        description: My bestest column
  - name: my_model_another
    columns:
      - *foo
Refs:
https://support.atlassian.com/bitbucket-cloud/docs/yaml-anchors/
https://medium.com/@kinghuang/docker-compose-anchors-aliases-extensions-a1e4105d70bd
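A third option worth knowing about: dbt itself ships docs blocks for exactly this kind of reusable description. You define the block once in any .md file in your project and reference it from every yml that needs it (bestest_column is a placeholder name):
{% docs bestest_column %}
My bestest column
{% enddocs %}
and then wherever the column appears:
      - name: foo
        description: '{{ doc("bestest_column") }}'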
We have been thinking about this problem extensively as well... our current solution is to modify the generate_model_yaml macro:
{% macro generate_model_yaml(model_name) %}

    {% set model_yaml = [] %}
    {% set existing_descriptions = fetch_existing_descriptions(model_name) %}

    -- # TO DO: pass model to fetch()
    -- if column not blank on current model, use description in returned dict
    -- otherwise, use global
    -- also extract tests on column anywhere in global scope

    {% do model_yaml.append('version: 2') %}
    {% do model_yaml.append('') %}
    {% do model_yaml.append('models:') %}
    {% do model_yaml.append('  - name: ' ~ model_name | lower) %}
    {% do model_yaml.append('    description: ""') %}
    {% do model_yaml.append('    columns:') %}

    {% set relation = ref(model_name) %}
    {%- set columns = adapter.get_columns_in_relation(relation) -%}

    {% for column in columns %}
        {%- set column = column.name | lower -%}
        {%- set col_description = existing_descriptions.get(column, '') %}
        {% do model_yaml.append('      - name: ' ~ column) %}
        {% do model_yaml.append('        description: "' ~ col_description ~ '"') %}
        {% do model_yaml.append('') %}
    {% endfor %}

    {% if execute %}
        {% set joined = model_yaml | join('\n') %}
        {{ log(joined, info=True) }}
        {% do return(joined) %}
    {% endif %}

{% endmacro %}
And then get the first description found that matches the column, with fetch_existing_descriptions():
{% macro fetch_existing_descriptions(current_model) %}

    {% set description_dict = {} %}
    {% set current_model_dict = {} %}

    {% for node in graph.nodes.values() | selectattr("resource_type", "equalto", "model") %}
        {% for col_dict in node.columns.values() %}
            {% if node.name == current_model %}
                -- Add current model descriptions to a separate dict to overwrite with later
                {% set col_description = {col_dict.name: col_dict.description} %}
                {% do current_model_dict.update(col_description) %}
            {% elif description_dict.get(col_dict.name, '') == '' %}
                {% set col_description = {col_dict.name: col_dict.description} %}
                {% do description_dict.update(col_description) %}
            {% endif %}
        {% endfor %}
    {% endfor %}

    -- Overwrite description_dict with the current model's descriptions
    {% do description_dict.update(current_model_dict) %}

    {% if var('DEBUG', False) %}
        {{ log(tojson(description_dict), info=True) }}
    {% else %}
        {{ return(description_dict) }}
    {% endif %}

{% endmacro %}
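Both macros can be exercised on their own before wiring up the script, since dbt run-operation accepts YAML args on the command line (my_model is a placeholder):
dbt run-operation generate_model_yaml --args '{model_name: my_model}'
dbt run-operation fetch_existing_descriptions --args '{current_model: my_model}' --vars '{DEBUG: true}'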
Finally, we use a bash script with the path to the model to write/overwrite a yaml file using the modified generate_model_yaml above:
#!/bin/bash
# Generates documentation for dbt models.
# Usage:
#   Run from within the gold folder.
#   Run with no args for all models. Provide an optional relative model path to generate docs for just one model:
#   Eg. `$ bash ./scripts/generate_docs.sh [./models/path/to_your_model.sql]`

if [[ -n $1 ]]; then # Build array of one model if optional arg provided
    yml="${1%.sql}.yml" # Create model yml filename
    touch "$yml" && rm "$yml" || exit 1 # Ensure filepath works by testing creation
    array=("$yml")
else # Create array of yml
    array=()
    while IFS= read -r -d $'\0'; do
        if [[ ${REPLY} != *"src"* ]]; then # Only proceed for model yml files (don't contain "src")
            if [[ -n $1 ]]; then
                # Include only the model yml of the optional arg
                if [[ $(basename $yml) == $(basename $REPLY) ]]; then
                    array+=("$REPLY")
                fi
            else
                array+=("$REPLY")
            fi
        fi
    done < <(find "./models/" -name "*.yml" -print0)
fi

# Create a copy of each model yml with the prescribed yml containing the docs standard.
for i in "${array[@]}"
do
    model=$(basename $i | sed -E 's|[.]yml||g')
    generated_yml=$(dbt run-operation generate_model_yaml --args "model_name: $model" | sed '1d')
    echo "$generated_yml" > "${i}_copy" # Create non-yml copy file to allow script to complete
done

# Once all copies are created, replace the originals
for i in "${array[@]}"
do
    cat "${i}_copy" > $i
    rm "${i}_copy"
done