How do I loop through alll columns using Jinja in DBT? - sql

I want to iterate over all the columns using dbt.

You can use the built-in adapter wrapper and adapter.get_columns_in_relation:
{% for col in adapter.get_columns_in_relation(ref('<<your model>>')) -%}
... {{ col.column }} ...
{% endfor %}

I think the star macro from the dbt-utils package + some for-loop logic might help you here? This depends on the exact use case and warehouse you're using (as pointed out in the comments).
The star macro generates a list of columns in the table provided.
So a possible approach would be something along the lines of:
{% for col in [{{ dbt_utils.star(ref('my_model')) }}] %}
...operation...
{% endfor %}

If you have the model node, and you have columns defined as model properties, this will work:
{% for col in model.columns.values() %}
... {{ col.name }} ... {{ col.data_type }} ...
{% endfor %}
You can get the model node from the graph:
{% set model = graph.nodes.values()
| selectattr("resource_type", "equalto", "model")
| selectattr("name", "equalto", model_name)
| first %}

Related

DBT run model only once

I've created a model to generate a calendar dimension which I only want to run when I explicitly specify to run it.
I tried to use incremental materialisation with nothing in is_incremental() block hoping dbt would do nothing if there was no query to satisfy the temporary view. Unfortunately this didn't work.
Any suggestion or thoughts for how I might achieve this greatly appreciated.
Regards,
Ashley
I've used a tag for this. Let's call this kind of thing a "static" model. In your model:
{{ config(tags=['static']) }}
and then in your production job:
dbt run --exclude tag:static
This doesn't quite achieve what you want, since you have to add the selector at the command line. But it's simple and self-documenting, which is nice.
I think you should be able to hack the incremental materialization to do this. dbt will complain about empty models, but you should be able to return a query with zero records. It'll depend on your RDBMS if this is really much better/faster/cheaper than just running the model, since dbt will still execute a query with the complex merge logic.
{{ config(materialized='incremental') }}
{% if is_incremental() %}
select * from {{ this }} limit 0
{% else %}
-- your model here, e.g.
{{ dbt_utils.date_spine( ... ) }}
{% endif %}
Your last/best option is probably to create a custom materialization that checks for an existing relation and no-ops if it finds one. You could borrow most of the code from the incremental materialization to do this. (You would add this as a macro in your project). Haven't tested this, but to give you an idea:
-- macros/static_materialization.sql
{% materialization static, default -%}
-- relations
{%- set existing_relation = load_cached_relation(this) -%}
{%- set target_relation = this.incorporate(type='table') -%}
{%- set temp_relation = make_temp_relation(target_relation)-%}
{%- set intermediate_relation = make_intermediate_relation(target_relation)-%}
{%- set backup_relation_type = 'table' if existing_relation is none else existing_relation.type -%}
{%- set backup_relation = make_backup_relation(target_relation, backup_relation_type) -%}
-- configs
{%- set unique_key = config.get('unique_key') -%}
{%- set full_refresh_mode = (should_full_refresh() or existing_relation.is_view) -%}
{%- set on_schema_change = incremental_validate_on_schema_change(config.get('on_schema_change'), default='ignore') -%}
-- the temp_ and backup_ relations should not already exist in the database; get_relation
-- will return None in that case. Otherwise, we get a relation that we can drop
-- later, before we try to use this name for the current operation. This has to happen before
-- BEGIN, in a separate transaction
{%- set preexisting_intermediate_relation = load_cached_relation(intermediate_relation)-%}
{%- set preexisting_backup_relation = load_cached_relation(backup_relation) -%}
-- grab current tables grants config for comparision later on
{% set grant_config = config.get('grants') %}
{{ drop_relation_if_exists(preexisting_intermediate_relation) }}
{{ drop_relation_if_exists(preexisting_backup_relation) }}
{{ run_hooks(pre_hooks, inside_transaction=False) }}
-- `BEGIN` happens here:
{{ run_hooks(pre_hooks, inside_transaction=True) }}
{% set to_drop = [] %}
{% if existing_relation is none %}
{% set build_sql = get_create_table_as_sql(False, target_relation, sql) %}
{% elif full_refresh_mode %}
{% set build_sql = get_create_table_as_sql(False, intermediate_relation, sql) %}
{% set need_swap = true %}
{% else %}
{# ----- only changed the code between these comments ----- #}
{# NO-OP. An incremental materialization would do a merge here #}
{% set build_sql = "select 1" %}
{# ----- only changed the code between these comments ----- #}
{% endif %}
{% call statement("main") %}
{{ build_sql }}
{% endcall %}
{% if need_swap %}
{% do adapter.rename_relation(target_relation, backup_relation) %}
{% do adapter.rename_relation(intermediate_relation, target_relation) %}
{% do to_drop.append(backup_relation) %}
{% endif %}
{% set should_revoke = should_revoke(existing_relation, full_refresh_mode) %}
{% do apply_grants(target_relation, grant_config, should_revoke=should_revoke) %}
{% do persist_docs(target_relation, model) %}
{% if existing_relation is none or existing_relation.is_view or should_full_refresh() %}
{% do create_indexes(target_relation) %}
{% endif %}
{{ run_hooks(post_hooks, inside_transaction=True) }}
-- `COMMIT` happens here
{% do adapter.commit() %}
{% for rel in to_drop %}
{% do adapter.drop_relation(rel) %}
{% endfor %}
{{ run_hooks(post_hooks, inside_transaction=False) }}
{{ return({'relations': [target_relation]}) }}
{%- endmaterialization %}
We are working with dbt run --select MODEL_NAME for each model we want to run. So a dbt run in our environment never executes more then one model. By doing so you never run in a situation where you execute a model by accident.

Django template syntax error: could not parse remainder % 2

I am getting a TemplateSyntaxError: "could not parse remainder % 2 from num%2":
{% if num%2 ==0 %}
{{"Even"}}
{% else %}
{{"Odd"}}
{% endif %}
You can't use arbitrary Python expressions in Django templates. You should create a custom filter for them.
However, for your expression there is a built-in tag divisibleby. From its example:
{{ value|divisibleby:"2" }}
If value is 4, the output would be True. So the final answer looks like (untested):
{% if num|divisibleby:"2" %}
Even
{% else %}
Odd
{% endif %}

Macro to surface models to other schemas - dbt_utils.star()

Problem
Currently in my CI process, I am surfacing specific models built to multiple schemas. This is generally my current process.
macros/surface_models.sql
{% set model_views = [] %}
{% for node in graph.nodes.values() %}
{% if some type of filtering criteria %}
{%- do model_tables.append( graph.node.alias ) -%}
{% endif %}
{% endfor %}
{% for view in model_views %}
{% set query %}
'create view my_other_schema.' ~ table ~ 'as (select * from initial_schema.' ~ table ~ ');'
{% endset %}
{{ run_query(query) }}
{% endfor %}
while this works, if the underlying table/view's definition changes, the view created from the above macro will return an error like: QUERY EXPECTED X COLUMNS BUT GOT Y
I could fix this by writing each query with each query's explicit names:
select id, updated_at from table
not
select * from table
Question
Is there a way to utilize the above macro concept but using {{ dbt_utils.star() }} instead of *?

Concatenate columns using a macro in DBT for Redshift

I want to concatenate a few columns using column1 ^^ column2 ^^ ... syntax in DBT for Redshift. If there are NULL values in the columns ## should be used, resulting in f.e. ## ^^ ##. I have found the following macro for concatenation:
{% macro safe_concat(field_list) %}
{# Takes an input list and generates a concat() statement with each argument in the list safe_casted to a string and wrapped in an ifnull() #}
concat({% for f in field_list %}
ifnull(safe_cast({{ f }} as string), '##')
{% if not loop.last %}, {% endif %}
{% endfor %})
{% endmacro %}
When I use it in my select statement:
select
{{ safe_concat([street, city]) }} as address_key
from source
I get the following error. Is this related to the code I am using?
Database Error in model address (models/address.sql)
syntax error at or near "as"
LINE 32: ifnull(safe_cast( as string), '##')
Try wrapping your column names in quotes when you call them in the macro - I think it’s trying to pass in the variables street and city (because you’re already inside of curly braces), which don’t exist so are evaluating to None
you can try pushing every loop into an array and then you can use evaluated strings.and also for concat func. you can use '~' this.
{% set query_results = [] %}
{% for f in field_list %}
{% set x = ifnull(safe_cast({{ f }} as string), '##') ~ '^^' %}
{% if not loop.last %}, {% endif %}
{% set query_results = query_results.append(x) %}
{% endfor %}
...
return{{query_results }}

how to split the string in django template?

i am trying to split the string in template using custom template filter. But i got an error
TemplateSyntaxError at /job/16/
'for' statements should use the format 'for x in y': for skill in form.instance.skills | split : ","
Here it is my filter
#register.filter(name='split')
def split(value, key):
"""
Returns the value turned into a list.
"""
return value.split(key)
this is my template
<h4>Skills</h4>
{% for skill in form.instance.skills | split : "," %}
{{ skill }}
{% endfor %}
Thanks
Split is a custom filter, don't forget to create your filter, and to load it in your HTML page.
Documentation for Django 4.0: https://docs.djangoproject.com/en/4.0/howto/custom-template-tags/
<h4>Skills</h4>
{% with form.instance.skills|split:"," as skills %}
{% for skill in skills %}
{{ skill }}<br>
{% endfor %}
{% endwith %}
For extract character string, use filter cut:
Phone
this removes the scripts from the string.
The direct for loop works too, you just have to remove the spaces in the syntax:
<h4>Skills</h4>
{% for skill in form.instance.skills|split:"," %}
{{ skill }}
{% endfor %}