In Ansible, how to initialise a variable from another variable?

In an Ansible role, how do you define a variable based on another one?
I am designing a role and want its interface to understand a playbook variable like framework_enable_java = yes or framework_enable_java = mysql tomcat, and I want to write a vars/main.yml file that defines the boolean values
framework_enable_java_core
framework_enable_java_mysql
framework_enable_java_tomcat
according to the content of framework_enable_java. I tried the obvious definitions similar to
framework_enable_java_mysql: 'mysql' in framework_enable_java
and several more or less subtle approaches like
framework_enable_java_mysql: {{ 'mysql' in framework_enable_java }}
or
{% if 'mysql' in framework_enable_java %}
framework_enable_java_mysql: yes
{% else %}
framework_enable_java_mysql: no
{% endif %}
None of them worked. The similar-looking question is unrelated, as it is more about variable indirection than variable deduction.
Is it at all possible to write the desired vars/main.yml for my role? What would it look like? If it is not possible, what would be the best way to make these deductions (e.g. using a task include)?

Answer from the comments:
framework_enable_java_mysql: "{{ 'mysql' in framework_enable_java }}"
Double quotes are essential here: without them, the YAML parser tries to construct an object (a dictionary) rather than a templated string.
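Putting it together, a vars/main.yml along these lines should work (a minimal sketch, assuming framework_enable_java is a space-separated string such as "mysql tomcat"; the rule for the core flag is an assumption, treating any non-empty value as enabling the core):
# vars/main.yml -- sketch
framework_enable_java_core: "{{ framework_enable_java | default('') | length > 0 }}"
framework_enable_java_mysql: "{{ 'mysql' in framework_enable_java }}"
framework_enable_java_tomcat: "{{ 'tomcat' in framework_enable_java }}"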

Related

How can I self reference the table I'm working on in a dbt Project?

I'm looking to self-reference the table I'm working on within a model file's config block, to alias the table name. Right now, I'm dynamically naming the alias using a for loop in a Python file, but would prefer if the model file recognized and designated the table name itself.
{{ config(
alias=model.table ### this.name? not sure what syntax to use here ###
) }}
select *
from {{ source('aceso', 'aceso_accountlookup') }}
{% if is_incremental() %}
where _FIVETRAN_SYNCED > (select max(_FIVETRAN_SYNCED) from {{ this }} )
{% endif %}
Currently I have no idea what syntax is required to get dbt to understand what I want it to do.
dbt currently has a strong one-database-object-per-model association, so what you seem to be describing (based on your answers to #tconbeer's questions in the comments) isn't really possible without something hacky like what you're already doing.
There is a GitHub Discussion about making it possible for dbt to generate multiple objects from a single model here, which you may wish to contribute to.

DBT - how to namespace tables generated by different versions of a project without schemas?

Let's say I have a project in dbt. When I run it, it generates a bunch of tables. Now I want to change the underlying SQL and see what happens to these tables, how they differ from before the change. So I want to be able to compare all the tables generated by the old version to all the tables generated by the new version. Ideally I would like the method to work for any number of versions, not just two. Basically the question is how to put each version in its own namespace.
Method 1: run the new version of the project in a new schema, so I can compare old.foo to new.foo. But getting another schema from the database admins is a painful process.
Method 2: Have both versions in the same schema, but add a prefix, like new_ to the table name for the new version. So, old version has table foo, new version has new_foo, and I compare foo to new_foo.
Is there any convenient way to do Method 2 in dbt? Is there a third method I should be considering? Or am I doing something fundamentally wrong to even find myself in this situation? It seems like it shouldn't be such a rare problem, but I can't find any information about it.
One possible way to do this is to override the default alias macro. The macro gets called even if there is no alias defined in the configuration, so you can use that as an opportunity to rename the target table.
The version below prefixes any model that does not have an alias set in its configuration with the name of the target profile, whenever the run is not against the prod profile.
{% macro generate_alias_name(custom_alias_name=none, node=none) -%}
{%- if target.name != 'prod' and custom_alias_name is none -%}
{{ target.name ~ "_" ~ node.name }}
{%- elif target.name == 'prod' -%}
{{ node.name }}
{%- else -%}
{{ custom_alias_name | trim }}
{%- endif -%}
{%- endmacro %}
If your model is foo.sql and you run against a profile named "prod", the table will be foo. If you run against "dev", it will be dev_foo. If your model has an alias, the alias takes precedence regardless of the target profile; modify the else block if you want different behavior for aliased models.
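For example (a hypothetical session, assuming the macro is saved under the project's macros/ directory and the profiles are named as above):
dbt run --target dev     # model foo.sql builds as dev_foo
dbt run --target prod    # model foo.sql builds as foo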

Using environment-dependent source specifications in DBT

I have a bit of an odd problem where I need to union together multiple source databases in production, but only one in lower environments. I'm using DBT and would like to use the source functionality so I can trace the origin of my data lineage, but I'm not quite sure how to handle this case. Here's the naive non-source approach:
{% set clouds = [1, 2, 3] %} {# this clouds variable will come from the environment, instead of hard coded. In lower envs, it would just be [1] #}
{% for cloudId in clouds %}
select *
from raw_{{ cloudId }}.users
{% if not loop.last %}
union all
{% endif %}
{% endfor %}
This isn't ideal, because I'm referencing my raw_n schema(s) directly. I'd love to have something like this:
version: 2
sources:
  {% for cloud in env('CLOUDS') %}
  - name: raw_{{ cloud }}
    schema: raw_{{ cloud }}
    database: raw
    tables:
      - name: users
        identifier: users
  {% endfor %}
So I can actually use the source() function in the sql files.
I'm not sure how to make such a configuration possible based on environment. Can this just simply not be done in dbt?
Since source() is just a Python/Jinja function, you can pass variables to it. So the following should work:
{% if target.name == 'prod' %} {# this clouds variable will come from the environment, instead of hard-coded; in lower envs it would just be [1] #}
{% set clouds = [1, 2, 3] %}
{% else %}
{% set clouds = [1] %}
{% endif %}
{% for cloudId in clouds %}
select *
from {{ source('raw_' ~ cloudId, 'users') }}
{% if not loop.last %}
union all
{% endif %}
{% endfor %}
As for the environment part, you would have to use the env_var function, but environment variables are always strings, so you would write env_var('my_list').split(',') assuming it's comma-separated.
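Putting the two together, a minimal sketch (assuming a comma-separated environment variable such as CLOUDS=1,2,3, and sources named raw_<n> as in the question):
{# env_var always returns a string, so split it into a list of ids #}
{% set clouds = env_var('CLOUDS', '1').split(',') %}
{% for cloudId in clouds %}
select * from {{ source('raw_' ~ cloudId, 'users') }}
{% if not loop.last %} union all {% endif %}
{% endfor %}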
EDIT:
Per the asker's comments, revised the solution to include info about which environment is being used.
EDIT #2:
I know we left this off on a rather unhelpful note, but I have since hit a different issue that suggests a solution that might be more helpful for you.
In dbt_project.yml you can specify multiple paths for models/tests/seeds etc., and those paths can be dynamic. So you could potentially modify your model paths to something like model-paths: ['models', 'models_{{ target.name }}']. With this you have multiple source.yml files: models/source.yml will include all the sources that don't change between dev/test/prod, and the sources that do need to vary will be in models_{{ target.name }}.
The same goes for models that will use them.
I know this still isn't a dynamic sources file, but it preserves lineage, and you do it in YAML just like you wanted.
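A minimal sketch of that layout (the path names are illustrative, and it assumes dbt renders {{ target.name }} in model-paths as this edit describes):
# dbt_project.yml
model-paths: ['models', 'models_{{ target.name }}']
# models/source.yml        -- sources shared across environments
# models_dev/source.yml    -- dev-only sources
# models_prod/source.yml   -- prod-only sources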
Setting context here: I believe your primary interest is in working with the dbt docs / lineage graph for a prod / dev case?
In that case, as you are highlighting, the manifest is generated from the source.yml files within your model directories. So, effectively, what you are asking about is a way to "activate" different versions of a single source.yml file based on environment?
Fair warning: dbt Core's intentions don't align with that use case. So let's explore some alternatives.
If you want to hack something that is dbt-CLI / local only, Jeremy lays out how you could approach this via bash/stdout:
The bash-y way
Leveraging the fact that dbt/Jinja can write anything
its heart desires to stdout, pipe the stdout contents somewhere else.
$ dbt run-operation generate_source --args 'schema_name: jaffle_shop'
> models/staging/jaffle_shop/src_jaffle_shop.yml
At least one reason he points out is that there would be security implications if the dbt Jinja compiler were un-sandboxed from the /target destination, so I wouldn't expect this to change in dbt-core in the future.
A non-dbt helper tool.
From later in the same comment:
At some point, I'm so on board with the request—I'm just not sure if
dbt Core can be the right tool for the job, or it's a better fit for a
helper tool written in python.
Use git hooks to "switch" the active "source.yml" file in that directory?
This is just an idea that I haven't really looked into because it's somewhat far-fetched, but basically: use your environment variables to activate pre-run hooks that add source-dev.yml to .gitignore in production, and vice versa. The files would have to be defined statically, though, so I'm not sure that helps anyway.

In DBT - Setting a custom schema for a seed makes the ref not work

We started using the seeds functionality in dbt: we put a single CSV file in the data folder and configured the seed to use a custom schema named util, and it works (i.e. it creates a table in the correct schema).
The YAML looks like this:
seeds:
  my_project_name:
    +schema: util
However, when we refer to it using ref in our models:
{{ref('my_seed')}}
it looks for it in our default target schema for the environment (public), instead of the custom one we defined. How come?
I should mention that we also used the macro trick mentioned here:
https://docs.getdbt.com/docs/building-a-dbt-project/building-models/using-custom-schemas
Update:
Adding the macro code we used (as the file get_custom_schema.sql):
{% macro generate_schema_name(custom_schema_name, node) -%}
{{ generate_schema_name_for_env(custom_schema_name, node) }}
{%- endmacro %}
Not sure the + is needed in front of schema, based on the code example here.
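For context, the built-in generate_schema_name_for_env that your delegating macro calls looks roughly like this (paraphrased from the dbt docs): it only honors the custom schema when the target is named prod, which would explain ref resolving to the default target schema in other environments.
{% macro generate_schema_name_for_env(custom_schema_name, node) -%}
    {%- set default_schema = target.schema -%}
    {%- if target.name == 'prod' and custom_schema_name is not none -%}
        {{ custom_schema_name | trim }}
    {%- else -%}
        {{ default_schema }}
    {%- endif -%}
{%- endmacro %}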
Another option would be to define the schema for the specific seed table:
seeds:
  my_project_name:
    my_seed:
      schema: util

url templatetag with "safe" arguments?

I'm trying to use the {% url %} template tag, but with an argument to be substituted later in JavaScript. It looks something like this:
var pid = '7a8b323f-52b1-466c-91d3-b4i4d85b1c32';
var status_url = '{% url quote_status form_urlname inquiry_id instance_id '{0}' %}'.format(pid);
I tried using both {% autoescape off %} and |safe, neither of which seemed to work. Is there a good way to make this happen?
If the argument is required to build the URL, this just won't work directly: the template tag is executed on the server, while the JavaScript is executed in the browser.
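A common workaround is to render the URL with a placeholder value that the URL pattern will accept, then substitute it client-side. A sketch, assuming the pattern accepts any UUID-shaped string:
var pid = '7a8b323f-52b1-466c-91d3-b4i4d85b1c32';
// render with a dummy UUID the route will match, then swap it for the real id
var status_url = "{% url quote_status form_urlname inquiry_id instance_id '00000000-0000-0000-0000-000000000000' %}"
    .replace('00000000-0000-0000-0000-000000000000', pid);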