Execute dbt model only if var is not empty list - dbt

I have a dbt incremental model that looks roughly like this:
-- depends_on: {{ref('stg_table')}}
{% set dates_query %}
SELECT DISTINCT date FROM dates_table
{% if is_incremental() %}
WHERE date NOT IN (SELECT DISTINCT date FROM {{this}})
{% endif %}
{% endset %}
{% set dates_res = run_query(dates_query) %}
{% if execute %}
{# Return the first column #}
{% set dates_list = dates_res.columns[0].values() %}
{% else %}
{% set dates_list = [] %}
{% endif %}
{% if dates_list %}
with
{% for date in dates_list %}
prel_{{date | replace('-', '_')}} as (
SELECT smth FROM {{ref('stg_table')}}
WHERE some_date = cast('{{date}}' as date)
),
{% endfor %}
prel AS (
select * from prel_{{dates_list[0] | replace('-', '_')}}
{% for date in dates_list[1:] %}
union all
select * from prel_{{date | replace('-', '_')}}
{% endfor %}
)
SELECT some_transformations FROM prel
{% endif %}
But it fails with an error, because it runs the following statement in the database:
create or replace view model__dbt_tmp
as (
-- depends_on: stg_table
);
So the question is: how can I skip the model creation if the dates list is empty?
Thanks :)

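The model has to compile to a valid query even when the list is empty: one that has the right columns but returns zero rows. Note that with has to sit outside the if block so that both branches produce valid SQL. This should work:
with
{% if dates_list %}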
{% for date in dates_list %}
prel_{{date | replace('-', '_')}} as (
SELECT smth FROM {{ref('stg_table')}}
WHERE some_date = cast('{{date}}' as date)
),
{% endfor %}
prel AS (
select * from prel_{{dates_list[0] | replace('-', '_')}}
{% for date in dates_list[1:] %}
union all
select * from prel_{{date | replace('-', '_')}}
{% endfor %}
)
{% else %}
prel AS (
SELECT smth FROM {{ref('stg_table')}}
WHERE 1=0
)
{% endif %}
SELECT some_transformations FROM prel
Separately, I would simplify your code. Inside a for loop, Jinja exposes a special loop variable whose loop.first and loop.last flags are true only on the first and last iterations, respectively. So your for loop can become:
prel AS (
{% for date in dates_list %}
{% if not loop.first %}union all{% endif %}
select * from prel_{{date | replace('-', '_')}}
{% endfor %}
)
But really, I don't think you need all of this work with CTEs and unions. Your RDBMS probably supports the IN operator with dates, and/or this could just be a join.
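To make the IN suggestion concrete, here is a minimal sketch in plain Python (standing in for the Jinja rendering step, not dbt code) of the SQL text the template would need to produce. The function name build_prel_query is hypothetical; smth, some_date and stg_table are the placeholders from the question.

```python
def build_prel_query(dates, table="stg_table"):
    """Build the prel query with a single IN filter instead of one CTE per date.

    `dates` is a list of 'YYYY-MM-DD' strings (an assumption about the input).
    """
    if not dates:
        # Empty list: keep the same columns but return zero rows.
        predicate = "1 = 0"
    else:
        literals = ", ".join(f"cast('{d}' as date)" for d in dates)
        predicate = f"some_date IN ({literals})"
    return f"SELECT smth FROM {table} WHERE {predicate}"


print(build_prel_query(["2021-01-01", "2021-01-02"]))
print(build_prel_query([]))
```

Either branch yields a complete statement, so the model never compiles to an empty body.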

Related

DBT Macro With Parameter IF statement eval

I am trying to create a SQL template using a macro with one parameter, but the if condition does not evaluate to true when the macro is passed TABLE1 or TABLE2:
{% macro cloud_test_results_get_standard_columns(modelName) %}
result,
Length,
estimatedLength as estimatedLength,
{% if '{{modelName}}' == 'TABLE1' %}
TABL1_COL1,
TABL1_COL1,
TABL1_COL1,
{% elif '{{modelName}}' == 'TABLE2' %}
TABL1_COL1,
TABL1_COL1,
TABL1_COL1,
{% else %}
TABL_DEFAULT1,
TABL_DEFAULT2,
TABL_DEFAULT3,
{% endif %}
{% endmacro %}
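Please disregard: I had to use modelName instead of '{{modelName}}' inside the if block. Inside {% ... %} the variable is already in scope, so '{{modelName}}' is just a string literal that never equals 'TABLE1'.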
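The difference between comparing a quoted placeholder and comparing the variable itself can be sketched in plain Python. This is only an analogy to the Jinja behaviour; pick_branch_buggy and pick_branch are hypothetical names, and the return values are illustrative labels rather than the macro's columns.

```python
def pick_branch_buggy(model_name):
    # Mirrors '{{modelName}}' == 'TABLE1': the left side is a fixed string,
    # so model_name is never actually read.
    if "{{modelName}}" == "TABLE1":
        return "table1"
    elif "{{modelName}}" == "TABLE2":
        return "table2"
    return "default"


def pick_branch(model_name):
    # Mirrors modelName == 'TABLE1': the variable itself is compared.
    if model_name == "TABLE1":
        return "table1"
    elif model_name == "TABLE2":
        return "table2"
    return "default"


print(pick_branch_buggy("TABLE1"))
print(pick_branch("TABLE1"))
```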

DBT set variable using macros

My goal is to get the last 2 dates from the tables and use insert_overwrite for an incremental load of a large table. I am trying to set a variable inside the model by calling a macro I wrote. The SQL query is in BigQuery.
I get this error message:
'None' has no attribute 'table'
Inside the model:
{% set dates = get_last_two_dates('window_start',source('raw.event','tmp')) %}
The macro:
{% macro get_last_two_dates(target_column_name, target_table = this) %}
{% set query %}
select string_agg(format('%T',target_date),',') target_date_string
from (
SELECT distinct date({{ target_column_name }}) target_date
FROM {{ target_table }}
order by 1 desc
LIMIT 2
) a
{% endset %}
{% set max_value = run_query(query).columns[0][0] %}
{% do return(max_value) %}
{% endmacro %}
Thanks in advance. Let me know if you have any other questions.
You probably need to wrap {% set max_value ... %} with an {% if execute %} block:
{% macro get_last_two_dates(target_column_name, target_table = this) %}
{% set query %}
select string_agg(format('%T',target_date),',') target_date_string
from (
SELECT distinct date({{ target_column_name }}) target_date
FROM {{ target_table }}
order by 1 desc
LIMIT 2
) a
{% endset %}
{% if execute %}
{% set max_value = run_query(query).columns[0][0] %}
{% else %}
{% set max_value = "" %}
{% endif %}
{% do return(max_value) %}
{% endmacro %}
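The reason is that your macro actually gets run twice: once at parse time, when dbt scans all of the models to build the DAG, and a second time when the model is actually run. execute is true only on this second pass. During parsing no query is executed, so run_query has no result to return, which is likely what produces the 'None' has no attribute 'table' error.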
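The effect of the guard can be illustrated with a small simulation in plain Python. This is a sketch of the two-pass idea only, not dbt's implementation: run_query here is a hypothetical stand-in that returns None on the parse pass, and the returned row is made-up sample data.

```python
def run_query(sql, execute):
    """Hypothetical stand-in for dbt's run_query: no result exists while parsing."""
    if not execute:
        return None
    # Made-up sample result: one row, one column.
    return [["'2021-01-02','2021-01-01'"]]


def get_last_two_dates_unguarded(execute):
    # Reads the result unconditionally, so it breaks on the parse pass.
    return run_query("select ...", execute)[0][0]


def get_last_two_dates(execute):
    # Only touch the result when a query was actually executed.
    if execute:
        return run_query("select ...", execute)[0][0]
    return ""


print(repr(get_last_two_dates(execute=False)))  # parse pass: placeholder value
print(get_last_two_dates(execute=True))         # run pass: real result
```

Calling get_last_two_dates_unguarded(execute=False) raises a TypeError in this simulation, which plays the role of the error in the question.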

Macro to surface models to other schemas - dbt_utils.star()

Problem
Currently, in my CI process, I surface specific built models to multiple schemas. This is roughly my current process:
macros/surface_models.sql
{% set model_views = [] %}
{% for node in graph.nodes.values() %}
{% if some type of filtering criteria %}
{%- do model_views.append(node.alias) -%}
{% endif %}
{% endfor %}
{% for view in model_views %}
{% set query %}
create view my_other_schema.{{ view }} as (select * from initial_schema.{{ view }});
{% endset %}
{{ run_query(query) }}
{% endfor %}
While this works, if the underlying table/view's definition changes, the view created by the above macro will return an error like: QUERY EXPECTED X COLUMNS BUT GOT Y
I could fix this by writing each query with explicit column names:
select id, updated_at from table
not
select * from table
Question
Is there a way to utilize the above macro concept but using {{ dbt_utils.star() }} instead of *?

How to indicate "New" sign for fresh product on liquid

I'm using Liquid on Shopify, and I'm new to Liquid.
Now I'm trying to show a "New" sign for products created within the last 7 days.
{% assign today_date = 'now' | date: '%s' %}
{% assign create_date = product.created_at | date: '%s' %}
{% assign dif = today_date - create_date %}
<div>Time diff is {{ dif }}</div>
{% if dif < 30000 %}
<div>New</div>
{% endif %}
It shows errors like this.
Time diff is 1607714358
Liquid error: comparison of String with 30000 failed
What should I do?
Thanks in advance.
All Liquid arithmetic operations (+, -, *, /, %) are done via filters.
So {% assign dif = today_date - create_date %} is incorrect.
It should be {% assign dif = today_date | minus: create_date %}.
This is the only syntax mistake in the code. Note, though, that dif is measured in seconds, so a 7-day window is 604800 seconds (7 × 24 × 60 × 60), not 30000.
The final code should be:
{% assign today_date = 'now' | date: '%s' %}
{% assign create_date = product.created_at | date: '%s' %}
{% assign dif = today_date | minus: create_date %}
<div>Time diff is {{ dif }}</div>
{% if dif < 604800 %}
<div>New</div>
{% endif %}
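As a cross-check outside of Liquid, the same age comparison can be sketched in Python. The timestamps are passed as strings, like the output of date: '%s', and converted to integers before subtracting, which is the job the minus filter does in the template. The name is_new and its parameters are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def is_new(created_at_unix: str, now_unix: str, max_age_days: int = 7) -> bool:
    """Return True if the product is younger than max_age_days."""
    # Convert the string timestamps to numbers before doing arithmetic.
    age_seconds = int(now_unix) - int(created_at_unix)
    return age_seconds < max_age_days * SECONDS_PER_DAY


print(7 * SECONDS_PER_DAY)                  # size of a 7-day window in seconds
print(is_new("1607000000", "1607100000"))   # about 1.2 days old
print(is_new("1607000000", "1608000000"))   # about 11.6 days old
```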

How can I filter orders made after a certain date in Liquid / Shopify?

{% for orders in checkout.customer.orders %}
//count the orders
{% endfor %}
I need to count only the orders made after a certain date.
How can I do this in Liquid / Shopify?
All orders have a created_at date that you can output in various formats using the Liquid date filter. You would loop through the orders as above and compare each date with the threshold date in question, using Unix timestamps for the comparison:
{% assign ordersThresholdUnix = '2019-01-01' | date: '%s' %}
{% assign ordersCount = 0 %}
{% for order in checkout.customer.orders %}
{% assign orderDateUnix = order.created_at | date: '%s' %}
{% if orderDateUnix > ordersThresholdUnix %}
{% assign ordersCount = ordersCount | plus: 1 %}
{% endif %}
{% endfor %}
You can then output {{ ordersCount }}.
Note that I don't think Shopify will allow you to paginate further back than 50 orders.
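The counting logic itself (convert both dates to Unix timestamps, then increment a counter for each order past the threshold) can be sketched in Python. This is an illustration of the logic only; count_orders_after is a hypothetical name, and the input is assumed to be a list of timezone-aware datetime objects rather than Shopify order objects.

```python
from datetime import datetime, timezone


def count_orders_after(order_dates, threshold):
    """Count the orders created strictly after `threshold` ('YYYY-MM-DD', UTC)."""
    threshold_unix = (
        datetime.strptime(threshold, "%Y-%m-%d")
        .replace(tzinfo=timezone.utc)
        .timestamp()
    )
    count = 0
    for created_at in order_dates:
        if created_at.timestamp() > threshold_unix:
            count += 1  # increment, rather than resetting to 0
    return count


sample_orders = [
    datetime(2018, 12, 31, tzinfo=timezone.utc),
    datetime(2019, 1, 2, tzinfo=timezone.utc),
    datetime(2020, 6, 1, tzinfo=timezone.utc),
]
print(count_orders_after(sample_orders, "2019-01-01"))
```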