Is there an easy way to write a test for a column being positive in dbt?
accepted_values doesn't seem to work for continuous variables.
I know you can write queries in ./tests, but that looks like overkill for such a simple thing.
You could use dbt_utils.expression_is_true
version: 2
models:
  - name: model_name
    tests:
      - dbt_utils.expression_is_true:
          expression: "col_a > 0"
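To see what this kind of test checks, here is a minimal Python sketch. It assumes the test boils down to selecting the rows where the expression is not true, and passing when no rows come back. The helper name failing_rows, the in-memory SQLite table, and the sample values are all made up for illustration; this is not the actual dbt_utils implementation.

```python
import sqlite3


def failing_rows(conn, table, expression):
    """Return the rows of `table` for which `expression` is not true.

    Hypothetical helper: `table` and `expression` are interpolated
    directly into the SQL, so only pass trusted values.
    """
    query = f"SELECT * FROM {table} WHERE NOT ({expression})"
    return conn.execute(query).fetchall()


# Illustrative data only; model_name and col_a mirror the YAML above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE model_name (id INTEGER, col_a REAL)")
conn.executemany(
    "INSERT INTO model_name VALUES (?, ?)",
    [(1, 2.5), (2, 0.0), (3, -1.0), (4, None)],
)

bad = failing_rows(conn, "model_name", "col_a > 0")
print("test passes" if not bad else f"failing rows: {bad}")
```

In this sketch the row with a NULL col_a is not reported, because NOT (NULL > 0) evaluates to NULL rather than true. If NULLs should also count as failures, pairing the check with a not_null test is one option.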
I think the dbt_utils suggestion is good. The only reasonable alternative I can think of is writing a custom schema test:
https://docs.getdbt.com/docs/guides/writing-custom-schema-tests/
But why bother when you can just use expression_is_true?
Hello. Could anyone help me with how to handle this scenario? For example, I want to validate these 3 fields on my table, "symbol_type", "symbol_subtype", and "taker_symbol", and return a unique combination/result.
I tried to use this command, but it is not working properly in my test. I'm not sure whether this is the correct syntax for my scenario. Your response is highly appreciated.
Expected result: these 3 fields should return a unique combination using dbt commands.
I'd recommend either:
use the generate_surrogate_key (docs) macro in the model, or
use the dbt_utils.unique_combination_of_columns (docs) generic test.
For the first case, you would need to define the following in the model:
select
    {{ dbt_utils.generate_surrogate_key(['symbol_type', 'symbol_subtype', 'taker_symbol']) }} as hashed_key_,
    (...)
from your_model
This would create a hashed value of the three columns. You could then use a unique test in your YAML file.
For the second case, you would only need to add the generic test in your YAML file as follows:
# your model's YAML file
- name: your_model_name
  description: ""
  tests:
    - dbt_utils.unique_combination_of_columns:
        combination_of_columns:
          - symbol_type
          - symbol_subtype
          - taker_symbol
Both these approaches will let you check whether the combination of the three columns is unique over the whole model's output.
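As a rough illustration of what a uniqueness check over several columns amounts to, here is a small Python sketch that groups by the three columns and reports any combination that appears more than once. The helper name duplicate_combinations, the in-memory SQLite table, and the sample rows are assumptions made for the example; it is not how dbt_utils implements the test.

```python
import sqlite3


def duplicate_combinations(conn, table, columns):
    """Return each column combination that occurs more than once, with its count.

    Hypothetical helper: `table` and `columns` are interpolated directly
    into the SQL, so only pass trusted values.
    """
    cols = ", ".join(columns)
    query = (
        f"SELECT {cols}, COUNT(*) AS n FROM {table} "
        f"GROUP BY {cols} HAVING COUNT(*) > 1"
    )
    return conn.execute(query).fetchall()


# Illustrative data only; the column names come from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_model "
    "(symbol_type TEXT, symbol_subtype TEXT, taker_symbol TEXT)"
)
conn.executemany(
    "INSERT INTO your_model VALUES (?, ?, ?)",
    [
        ("fx", "spot", "EURUSD"),
        ("fx", "spot", "GBPUSD"),
        ("fx", "spot", "EURUSD"),  # repeats the first row's combination
    ],
)

columns = ["symbol_type", "symbol_subtype", "taker_symbol"]
print(duplicate_combinations(conn, "your_model", columns))
```

An empty result means the combination is unique across the table; any returned row is a combination that would make a uniqueness test fail.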
I am trying to set up a singular test in dbt (a test for one specific table, TableA), so I wrote an SQL query and placed it in the tests folder. It returns the failing rows.
However, when I run dbt test --select tableA and the test passes (no failing records), I get the following error:
14:20:57 Running dbt Constraints
14:20:58 Database error while running on-run-end
14:20:59 Encountered an error:
Compilation Error in operation dbt_constraints-on-run-end-0 (./dbt_project.yml)
'dbt.tableA.graph.compiled.CompiledSingularTestNode object' has no attribute 'test_metadata'
In case the test fails, it returns the failing rows, which is correct behaviour.
I am using the dbt_constraints package (v0.3.0), which seems to be causing this problem, specifically this script, which runs in the on-run-end hook: https://github.com/Snowflake-Labs/dbt_constraints/blob/main/macros/create_constraints.sql
I am guessing I need to add some test metadata to the singular test, but I'm not sure how to do it.
Here is what the test looks like
tests/table_a_test.sql
SELECT *
FROM {{ ref('TableA') }}
WHERE param_1 NOT IN (
    SELECT TableB_id
    FROM {{ ref('TableB') }}
    UNION
    SELECT TableC_id
    FROM {{ ref('TableC') }}
    UNION
    SELECT TableD_id
    FROM {{ ref('TableD') }}
    UNION
    SELECT TableE_id
    FROM {{ ref('TableE') }}
)
AND param_2 IS NULL
Thank you!
This seems to be a bug in that package; I would open an issue in the dbt-constraints repo. There is no documented way to add metadata to a Singular test, but that code assumes that all tests will have test_metadata.name.
I doubt this would work, but what happens if you add a schema.yml file to the tests directory, alongside your singular test? The contents would look like:
version: 2
tests:
  - name: table_a_test
It sounds like your call should be dbt test --select table_a_test instead of dbt test --select tableA. I think you need to select the test by its name, not by the table name, which is already hard-coded in the (singular) test. Does that work?
Have you tried running the test with a + sign in front of it? Since you are using ref in the test, you might need to build everything upstream before testing.
Is there an easy way to handle a huge set of query params like the one below? I would also like to know how I can parameterise some of the values at run time.
http://154.213.196.243:7941/v1/banking/Jumio/callback?callBackType=NetVerifyId&jumioIdScanReference=123abcde-1244-8571-3454-abcd12345567&merchantIdScanReference=66a9ff2e-d8ec-e811-a956-000d3ab3f117&verificationStatus=APPROVED_VERIFIED&idScanStatus=SUCCESS&id+ScanSource=API&idCheckDataPositions=OK&idCheckDocumentValidation=OK&idCheckHologram=OK&idCheckMRZcode=OK&idCheckMicroprint=OK&idCheckSecurityFeatures=OK&idCheckSignature=OK&transactionDate=2018-11-20T20%3A53%3A25.797Z&callbackDate=2018-11-20T20%3A53%3A25.797Z&idType=DRIVING_LICENSE&idCountry=GBR&idScanImage+=https%3A%2F%2Fnetverify.com%2Frecognition%2Fv1%2Fidscan%2F123abcde-1244-8571-3454-abcd12345567%2Ffront&idFirstName=ILARIA&idLastName=FURS&idDob=1976-12-23&idExpiry=2025-12-31&personalNumber=123456789&clientIp=xxx.xxx.xxx.xxx&idAddress=%7B%22country%22%3A%22USA%22%2C%20%22stateCode%22%3A%22US-OH%22%7D&idNumber=P12345&idStatus=TESTER961260SS9DL54&identityVerification=%7B%22similarity%22%3A%22MATCH%22%2C%22validity%22%3Atrue%7D HTTP/1.1
Yes. Read the docs: https://github.com/intuit/karate#param
For example:
* param callBackType = 'NetVerifyId'
and so on. Also look at params, which lets you set up all keys as a single JSON object and parameterize them if needed; there are multiple possibilities: https://github.com/intuit/karate#params
See this example as well: dynamic-params.feature
So I have a table of phone_numbers in Rails 3.2, and I would like to be able to make a query such as the following:
PhoneNumber.where("last_called_at - #{Time.now} > call_spacing * 60")
where last_called_at is a datetime and call_spacing is an integer, representing the number of minutes between calls.
I have tried the above, I have also tried using
PhoneNumber.where("datediff(mi, last_called_at, #{Time.now}) > call_spacing")
If anyone could help me make this work, or even recommend a current, up-to-date rails SQL gem, I would be very grateful.
Edit: So the best way to make this work for me was to instead add a ready_at column to my database, and update that whenever either call_spacing or last_called_at was updated, using the before_save method. This is probably the most semantic way to approach this problem in rails.
You have to use quotes around #{Time.now}. To make your first query work you may use TIME_TO_SEC() function:
PhoneNumber.where("(TIME_TO_SEC('#{Time.now}') - TIME_TO_SEC(last_called_at)) > (call_spacing * 60)")
Here is my way to do this:
PhoneNumber.where("last_called_at > '#{Time.zone.now.utc - (call_spacing * 60)}'")
Also take a look at this article on how to work with time zones:
http://danilenko.org/2012/7/6/rails_timezones/
With Django it is possible to find models using the filter method with keyword-arguments like so:
MyModel.objects.filter(serialNo__gt=10)
giving all models with a serial number greater than 10.
Is it possible to use a similar query language with SQLAlchemy? I know that one can write something like MyModel.serialNo < 10, but then the code that uses this construct needs to import MyModel, and I want to create the keywords/query parameters externally without importing MyModel (for a facade pattern).
The concept of "<attributename>__<operatorname>=<value>" is not built in to SQLAlchemy's Query; however, the effect is very easy to reproduce. Here's a quick example done by the author of Flask: https://github.com/mitsuhiko/sqlalchemy-django-query/blob/master/sqlalchemy_django_query.py
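For a sense of how little is needed, here is a minimal sketch of the idea, separate from the linked project. It splits Django-style name__op keys and applies the matching function from Python's operator module to the model attribute. The names OPERATORS and build_criteria are hypothetical, and the approach assumes the model's attributes overload the comparison operators, as SQLAlchemy mapped columns do.

```python
import operator

# Hypothetical mapping from lookup suffix to comparison function.
OPERATORS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
    "ne": operator.ne,
    "exact": operator.eq,
}


def build_criteria(model, **lookups):
    """Turn keyword lookups such as serialNo__gt=10 into comparison expressions.

    Each key is split on its last "__"; a key without "__" is treated
    as an equality lookup.
    """
    criteria = []
    for key, value in lookups.items():
        name, sep, op_name = key.rpartition("__")
        if not sep:
            name, op_name = key, "exact"
        if op_name not in OPERATORS:
            raise ValueError(f"unsupported lookup: {key!r}")
        column = getattr(model, name)
        # For a SQLAlchemy column this builds an expression such as
        # `model.serialNo > 10`; for plain values it is just a bool.
        criteria.append(OPERATORS[op_name](column, value))
    return criteria
```

With a mapped class, the result could be passed along as session.query(MyModel).filter(*build_criteria(MyModel, serialNo__gt=10)). Only the facade needs to know MyModel; callers supply nothing but keyword arguments.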