I want to replicate a simple case-when statement with a jinja block in dbt.
How can I achieve this?
Here is my statement:
CASE status
WHEN 0 THEN 'pending'
WHEN 1 THEN 'ordered'
WHEN 2 THEN 'shipped'
WHEN 3 THEN 'received'
WHEN 4 THEN 'delivered'
ELSE NULL
END as status_mapping
You have a couple of options. First, you can define the mappings in an array (or in a dict, if the ids are not sequential) and loop through it to produce the full case statement:
{% set statuses = ['pending', 'ordered', 'shipped', 'received', 'delivered'] %}
CASE STATUS
{% for status in statuses %}
WHEN {{ loop.index - 1 }} THEN '{{ status }}'
{% endfor %}
ELSE NULL END STATUS_MAPPING
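To see what the loop above renders to, here is a minimal sketch in plain Python that builds the same CASE expression. It only imitates the Jinja loop (dbt itself renders the template). The helper name `build_case` and the codes in the `sparse` dict are assumptions made up for illustration.

```python
def build_case(column, mapping, alias):
    """Build a simple CASE expression from a {code: label} mapping."""
    lines = [f"CASE {column}"]
    for code, label in mapping.items():
        # Double any single quotes so the label stays a valid SQL string literal.
        safe_label = label.replace("'", "''")
        lines.append(f"    WHEN {code} THEN '{safe_label}'")
    lines.append(f"    ELSE NULL END AS {alias}")
    return "\n".join(lines)


# Sequential ids: the position in the list is the code (like loop.index - 1 in Jinja).
statuses = ["pending", "ordered", "shipped", "received", "delivered"]
sequential = dict(enumerate(statuses))

# Non-sequential ids: spell the codes out in a dict (hypothetical codes).
sparse = {10: "pending", 20: "ordered", 35: "shipped"}

if __name__ == "__main__":
    print(build_case("status", sequential, "status_mapping"))
    print(build_case("status", sparse, "status_mapping"))
```

In a dbt model the dict variant would be written as a Jinja loop over `mapping.items()` instead of a Python function.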
The other option is to put the mappings into a CSV, load it as a seed file in dbt (https://docs.getdbt.com/docs/build/seeds), then join to the seed using ref.
Create a file called status_mappings.csv:
status_code,status
0,pending
1,ordered
2,shipped
3,received
4,delivered
Run dbt seed, then add the seed as a CTE in your model and join to it:
WITH STATUS_MAPPINGS AS (
SELECT * FROM {{ ref('status_mappings') }}
)
SELECT SM.STATUS
FROM MY_TABLE T1
JOIN STATUS_MAPPINGS SM ON T1.STATUS_CODE = SM.STATUS_CODE
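One thing to watch with the join approach: an inner JOIN drops rows whose code has no mapping, whereas the CASE version returns NULL for them. The sketch below shows the difference using Python's built-in sqlite3 module as a stand-in for the warehouse. The `my_table` rows and the unmapped code 9 are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Stand-in for the table that `dbt seed` would create from the CSV.
    CREATE TABLE status_mappings (status_code INTEGER PRIMARY KEY, status TEXT);
    INSERT INTO status_mappings VALUES
        (0, 'pending'), (1, 'ordered'), (2, 'shipped'),
        (3, 'received'), (4, 'delivered');

    -- Hypothetical source table; status_code 9 has no mapping.
    CREATE TABLE my_table (order_id INTEGER, status_code INTEGER);
    INSERT INTO my_table VALUES (100, 0), (101, 2), (102, 9);
    """
)

# LEFT JOIN keeps unmapped codes with a NULL status, like ELSE NULL in the CASE.
rows = conn.execute(
    """
    SELECT t1.order_id, sm.status
    FROM my_table t1
    LEFT JOIN status_mappings sm ON t1.status_code = sm.status_code
    ORDER BY t1.order_id
    """
).fetchall()

# A plain (inner) JOIN silently drops order 102.
inner_rows = conn.execute(
    """
    SELECT t1.order_id, sm.status
    FROM my_table t1
    JOIN status_mappings sm ON t1.status_code = sm.status_code
    ORDER BY t1.order_id
    """
).fetchall()

if __name__ == "__main__":
    print(rows)
    print(inner_rows)
```

If every code is guaranteed to have a mapping, the two joins return the same rows.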
You can use a macro to insert reusable SQL snippets across different queries, which is one possible reason you might want to do this.
You could define the macro as follows:
-- yourproject/macros/status_mapping.sql
{% macro status_mapping(status) %}
CASE {{ status }}
WHEN 0 THEN 'pending'
WHEN 1 THEN 'ordered'
WHEN 2 THEN 'shipped'
WHEN 3 THEN 'received'
WHEN 4 THEN 'delivered'
ELSE NULL
END
{% endmacro %}
(I have kept the definition flexible)
... and call it in a model e.g. as follows:
-- yourproject/models/base/base__orders.sql
SELECT
order_id,
status_code,
{{ status_mapping('status_code') }} AS status
FROM
{{ source('your_dataset', 'orders') }}
Note the use of quotes around the field name, same as with the built-in source macro two lines below. By including the field name as a macro argument instead of hard-coding it (and keeping the aliasing AS status outside the macro) you allow yourself flexibility to change things in future.
When you run dbt, this would then be compiled to something like:
SELECT
order_id,
status_code,
CASE status_code
WHEN 0 THEN 'pending'
WHEN 1 THEN 'ordered'
WHEN 2 THEN 'shipped'
WHEN 3 THEN 'received'
WHEN 4 THEN 'delivered'
ELSE NULL
END AS status
FROM
your_dataset.orders
Related
I have a piece of an Oracle query that I need to optimize a little:
select
case
when SUM(dnl.quantity) = line.quantity then 1
when SUM(dnl.quantity) < line.quantity then 0
when SUM(dnl.quantity) > line.quantity then 2
end
from mytable dnl
line.quantity comes from another part of the query, which I don't think is needed for this example. I would like to calculate SUM(dnl.quantity) only once instead of at every WHEN, something like:
select
case SUM(dnl.quantity)
when line.quantity then 1
when < line.quantity then 0
when > line.quantity then 2
end
from mytable dnl
But obviously this gives an error at the second and third WHEN.
You are over-optimizing. The Oracle compiler can decide how many times it wants to evaluate sum(dnl.quantity). However, the data movement is usually much more expensive than the calculation of aggregations on a single column.
That said, if you are really concerned about this, you can use sign():
(case sign(sum(dnl.quantity) - line.quantity)
when 0 then 1
when -1 then 0
when 1 then 2
end)
Or, to be more inscrutable:
sign(sum(dnl.quantity) - line.quantity) + 1
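To check that the sign() arithmetic really matches the original three-branch CASE, here is a small sketch in plain Python. The function names are made up for illustration, and it ignores NULL handling, which in SQL would make both forms return NULL.

```python
def sign(x):
    """Return -1, 0 or 1, like SQL's SIGN() for non-NULL numbers."""
    return (x > 0) - (x < 0)


def compare_case(total, line_quantity):
    """The original three-branch CASE: 1 if equal, 0 if less, 2 if greater."""
    if total == line_quantity:
        return 1
    if total < line_quantity:
        return 0
    return 2


def compare_sign(total, line_quantity):
    """The one-liner: sign(total - line_quantity) + 1."""
    return sign(total - line_quantity) + 1


if __name__ == "__main__":
    for total, qty in [(5, 5), (3, 5), (8, 5)]:
        print(total, qty, compare_case(total, qty), compare_sign(total, qty))
```

Both functions agree because sign() yields -1, 0 and 1 for less, equal and greater, and adding 1 shifts those to 0, 1 and 2.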
I have the following code that uses a right join to connect my data from Table 1 to Table 2. dbt compiled the code successfully without errors, but I'm not getting the columns I need...
{{
config(
materialized='incremental'
)
}}
with incremental_salesorder as (
select * from {{ source('db_warehouse', 'sale_order_line') }}
),
final as (
select
distinct incremental_salesorder.product_code_cust,
incremental_salesorder.order_id as id,
incremental_salesorder.create_date,
incremental_salesorder.name as product_name,
incremental_salesorder.product_name_cust,
sale_order.name as sale_order_ref
from incremental_salesorder
right join {{ source('db_warehouse', 'sale_order')}} using (id)
ORDER BY incremental_salesorder.create_date
)
{% if is_incremental() %}
where incremental_salesorder.create_date >= (select max(create_date) from {{ this }} )
{% endif %}
select * from final
incremental_salesorder.order_id and incremental_salesorder.name are not in the results, even though the code compiled successfully.
What am I doing wrong here... ?
Rookie mistake:
Make sure the model key defined in dbt_project.yml matches the folder name:
models:
  dbt_test:
    # Applies to all files under models/example/
    example:
      materialized: view
      +schema: staging
      +enabled: false
    sales_order_unique_incremental: # <- this key must match the folder name
      materialized: table
      +schema: datastudio
I completely missed the warning. Once this was corrected I was able to compile the query and got the results I needed. In case anyone needs an example of how to do a join, this is a working method :)
I have a table with a flag called FL_virtual. If this flag equals 1, I need to get my stock in a special way, using a function.
Now I want to write a select statement for it, but depending on this flag I need to adjust the select to call a function instead of using a subquery.
So presume I start with this select statement:
select product_name,..(other options from the product table),
(select sum(qy_stock) from STOCK where warehouse_id = 1) as 'qy_stock_internal',
(select sum(qy_stock) from STOCK where warehouse_id = 2) as qy_stock_external
From product
Now I need to replace the subquery (qy_stock) with a call to a function when the fl_virtual flag is 1,
so that it becomes like this:
select product_name,..(other options from the product table),
FN_GET_stock_PRODUCT(1) as qy_stock_internal,
FN_GET_stock_PRODUCT(2) as qy_stock_external
from product
So I thought a simple if-then-else structure would do, but for some reason I can't get it to work.
This is how I thought it would look:
select product_name,..(other options from the product table),
IF fl_virtual > 0 THEN
(select sum(qy_stock) from STOCK where warehouse_id = 1) as 'qy_stock_internal',
(select sum(qy_stock) from STOCK where warehouse_id = 2) as qy_stock_external
ELSE
FN_GET_stock_PRODUCT(1) as qy_stock_internal,
FN_GET_stock_PRODUCT(2) as qy_stock_external
END IF
But it doesn't work. Does anyone have an idea?
You're close: just use CASE instead of IF. You have to repeat your condition, since you cannot easily return two columns from a single CASE (see P.S.):
select product_name,
..(other options from the product table),
(CASE
WHEN fl_virtual > 0 THEN
(select sum(qy_stock) from STOCK where warehouse_id = 1)
ELSE
FN_GET_stock_PRODUCT(1)
END) as qy_stock_internal,
(CASE
WHEN fl_virtual > 0 THEN
(select sum(qy_stock) from STOCK where warehouse_id = 2)
ELSE
FN_GET_stock_PRODUCT(2)
END) as qy_stock_external
P.S.: It is possible to return multiple values from a single CASE, e.g. using Object Types, but that's stuff for a different question :-)
The best way is to use a UNION of those two queries, each with its own condition.
Btw, the names stock and warehouse look like a typical homework example.
So I have written a PostgreSQL function that is supposed to search a table based on a large number of optional input parameters, which I combine with lots of AND conditions. This one, however:
AND
(
(newcheck IS NULL)
OR
(
newcheck IS NOT NULL AND product.id IN(
CASE WHEN newcheck='New'
THEN
(SELECT product.id FROM product WHERE product.anew IS true)
ELSE
(SELECT product.id from product WHERE product.anew IS false)
END)
)
)
gives me a
ERROR: more than one row returned by a subquery used as an expression
This isn't helping much, since I do want it to return a lot more than one row.
The values of the newcheck variable will be sent from a dropdown menu in a web form so it can only be 'New' or 'Old'.
Any ideas on what might be causing this problem?
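A CASE expression returns a single value, so a subquery used as one of its results must not return more than one row. Move the CASE inside the subquery's WHERE clause instead. Try something like: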
AND ((newcheck IS NULL)
OR (newcheck IS NOT NULL
AND product.id IN (SELECT product.id
FROM product
WHERE product.anew = CASE WHEN newcheck='New'
THEN true
ELSE false
END)))
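The rewritten predicate can be tried out with a small sketch using Python's built-in sqlite3 module. This is not PostgreSQL: SQLite has no boolean type, so `anew` is stored as 1/0 here, and `newcheck` is passed as a named query parameter rather than a function argument. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product (id INTEGER PRIMARY KEY, anew INTEGER);  -- 1 = true, 0 = false
    INSERT INTO product VALUES (1, 1), (2, 1), (3, 0);
    """
)

# The CASE now yields one scalar (1 or 0) that the subquery filters on,
# so the subquery itself is free to return many rows.
QUERY = """
    SELECT p.id
    FROM product p
    WHERE :newcheck IS NULL
       OR p.id IN (SELECT product.id
                   FROM product
                   WHERE product.anew = CASE WHEN :newcheck = 'New' THEN 1 ELSE 0 END)
    ORDER BY p.id
"""


def search(newcheck):
    """Return matching product ids; newcheck is 'New', 'Old' or None."""
    return [row[0] for row in conn.execute(QUERY, {"newcheck": newcheck})]


if __name__ == "__main__":
    print(search(None), search("New"), search("Old"))
```

The explicit `newcheck IS NOT NULL` test from the original is left out here because the OR already covers the NULL case.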
I'm trying to automatically produce a pie chart in Access that shows three values (completed, pending and queued).
However, in my status field I have several values for pending (sent, monitoringStage1, monitoringStage2, finalApproval).
My query at present gives me the count of each item individually:
SELECT Status, Count(*) AS [Count]
FROM StatusTable
GROUP BY Status
ORDER BY Status;
How would I edit it to count sent, monitoringStage1, monitoringStage2, and finalApproval as one item called pending?
Also on a side note, does anybody know how to put a line at a certain point on a piechart by percentage? So in the piechart created I could have a line to indicate the target number of completed items to compare against current progress.
You can't use CASE in an Access query (Select Case exists only in VBA), so use IIf instead.
This will take care of it:
SELECT qryTestx.ThisStatus, Count(qryTestx.Status) AS CountOfStatus
FROM (SELECT (IIf([StatusTable.Status] in ("sent","monitoringStage1","monitoringStage2","finalApproval"),"Pending",[StatusTable.Status])) AS ThisStatus, StatusTable.Status
FROM StatusTable) qryTestx
GROUP BY qryTestx.ThisStatus;
Try this (it needs a database that supports CASE expressions, which Access SQL does not) and don't forget to vote ;)
select
count(*),
case
when Status in ( 'sent', 'monitoringStage1', 'monitoringStage2', 'finalApproval') then 'pending'
else status
end as status
from StatusTable
group by
case
when Status in ( 'sent', 'monitoringStage1', 'monitoringStage2', 'finalApproval') then 'pending'
else status
end
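As a quick check of the grouping logic, the sketch below runs an equivalent query with Python's built-in sqlite3 module. The sample rows are made up. SQLite lets GROUP BY refer to the column alias, while other databases may need the CASE repeated as in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE StatusTable (Status TEXT)")
conn.executemany(
    "INSERT INTO StatusTable VALUES (?)",
    [("completed",), ("completed",), ("queued",), ("sent",),
     ("monitoringStage1",), ("monitoringStage2",), ("finalApproval",)],
)

# Fold the four in-progress statuses into one 'pending' bucket, then count per bucket.
rows = conn.execute(
    """
    SELECT CASE
               WHEN Status IN ('sent', 'monitoringStage1',
                               'monitoringStage2', 'finalApproval') THEN 'pending'
               ELSE Status
           END AS grouped_status,
           COUNT(*) AS status_count
    FROM StatusTable
    GROUP BY grouped_status
    ORDER BY grouped_status
    """
).fetchall()

if __name__ == "__main__":
    print(rows)
```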