Any way to apply column quoting to all columns in a model - dbt

I am using dbt with Snowflake as my target, and the table and column names are PascalCased. I've noticed that for both persist_docs and tests you need to add the quote: true attribute to every column, as in the model example below.
If quote: true is not provided, or the name isn't wrapped in single and then double quotes (e.g. '"ColumnName"'), dbt doesn't quote the column when generating the SQL for tests or the alter column statements for comments.
Is there any setting at the model level or above that quotes all columns by default, or do I have to set quote: true on every column?
models:
  - name: tablename
    description: '....'
    columns:
      - name: ColumnName
        quote: true
        description: '{{ doc("ColumnName") }}'

I have the same need but could not find a built-in solution yet. My workaround is to define a macro that calls dbt_utils.star and then does this:
{% set columns_list = [] %}
{% for col in dbt_utils.star(table_relation, excluded_cols) %}
  {% do columns_list.append('"' ~ col ~ '"') %}
{% endfor %}
{{ return(columns_list) }}
You can call that macro to put quotes around all your columns.

How to update a text field with broken JSON literals in PostgreSQL?

I have a lot of character varying records in this format: {'address': 'New Mexico'}.
I would like to update all those values so they look like this: New Mexico.
I've been investigating how to do it, and it could probably be done with a regular expression, but I don't know how to apply it across a whole table, and I have never used regex in PostgreSQL before.
My idea is something like this:
SET location = regexp_replace(field, 'match pattern', 'replace string', 'g')
Valid JSON literals require double-quotes where your sample displays single quotes. Maybe you can fix that upstream?
To repair (assuming there are no other, unrelated single-quotes involved):
UPDATE web_scraping.iws_informacion_web_scraping
SET iws_localizacion = replace(iws_localizacion, '''', '"')::json ->> 'address'
WHERE iws_id = 3678
AND iws_localizacion IS DISTINCT FROM replace(iws_localizacion, '''', '"')::json ->> 'address';
The second WHERE condition (the IS DISTINCT FROM check) prevents updates to rows that wouldn't change. See:
How do I (or can I) SELECT DISTINCT on multiple columns?
Optional if such cases can be excluded.
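For comparison, the same repair can be sketched outside the database. This is only an illustration in Python, not part of the answer above: the function name extract_address is made up, and it carries the same assumption as the SQL, namely that the values contain no other single quotes.

```python
import json


def extract_address(raw: str) -> str:
    """Turn "{'address': 'New Mexico'}" into "New Mexico"."""
    # Same repair as the SQL: swap single quotes for double quotes so the
    # text becomes valid JSON. This breaks if a value itself contains a
    # single quote (for example an apostrophe in a place name).
    repaired = raw.replace("'", '"')
    return json.loads(repaired)["address"]


address = extract_address("{'address': 'New Mexico'}")
print(address)
```

A value such as {'address': 'O'Brien St'} would raise json.JSONDecodeError here, which is the same kind of failure the ::json cast would report in PostgreSQL.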

SQL: Extract from messy JSON nested field with backslashes

I have a table that has some rows with normal JSON and some with escaped values in the JSON field (backslashes)
| id | obj |
|----|-----|
| 1 | {"is_from_shopping_bag":true,"products":[{"price":{"amount":"18.00","currency":"USD","offset":100,"amount_with_offset":"1800"},"product_id":"1234","quantity":1}],"source":"cart"} |
| 2 | {"is_from_shopping_bag":"","products":"[{\ "product_id\ ":\ "2345\ ",\ "price\ ":{\ "currency\ ":\ "USD\ ",\ "amount\ ":\ "140.00\ ",\ "offset\ ":100},\ "quantity\ ":1}]"} |
(Note: I needed to include a space after the backslashes in the above table so that they would show up in the github generated markdown table -- my actual table does not include those spaces between the backslash and the quote character)
I am doing a sql query in Hive to get the 'currency' field.
Currently I can run
SELECT
id,
JSON_EXTRACT(obj, '$.products[0].price.currency')
FROM my_table
Which will give me the correct output for the first row, but gives me a NULL in the second row
| id | obj |
|----|-----|
| 1 | "USD" |
| 2 | NULL |
What is the best way to get currency field from the second row? Is there a way to clean up the field and remove the backslashes before trying to JSON_EXTRACT the relevant data?
I could use REPLACE to swap the '\ ' for '', but is that the most efficient method?
Replace \" with " using regexp_replace like this:
regexp_replace(obj,'\\\\"','"')
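If the cleanup can happen outside Hive, another option is to decode twice instead of stripping backslashes: the escaped rows look like a products array that was serialised a second time and stored as a string. The sketch below is in Python rather than Hive SQL, the helper name first_currency is hypothetical, and row2 is written without the display-only spaces mentioned in the question.

```python
import json


def first_currency(obj: str):
    """Return products[0].price.currency, or None if it is missing."""
    doc = json.loads(obj)
    products = doc.get("products")
    # Escaped rows: the array was stored as a JSON string, so decode it again.
    if isinstance(products, str):
        products = json.loads(products)
    if not products:
        return None
    return products[0].get("price", {}).get("currency")


row1 = '{"is_from_shopping_bag":true,"products":[{"price":{"amount":"18.00","currency":"USD","offset":100,"amount_with_offset":"1800"},"product_id":"1234","quantity":1}],"source":"cart"}'
# Raw string so the backslashes reach json.loads unchanged.
row2 = r'{"is_from_shopping_bag":"","products":"[{\"product_id\":\"2345\",\"price\":{\"currency\":\"USD\",\"amount\":\"140.00\",\"offset\":100},\"quantity\":1}]"}'

for row in (row1, row2):
    print(first_currency(row))
```

Both rows go through the same function, so the caller does not need to know in advance which rows are escaped.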

Filter centered column SQL Oracle

I am trying to build a query against an Oracle DB,
that is, a SELECT ... FROM ... WHERE.
The column "ORG" is centered and always has 4 letters. I would like to filter it on one specific value.
I already tried WHERE ORG = 'HHAH'
and also something with SUBSTR(ORG ...,
but somehow nothing works.
Does anybody have an idea?
I have values of ' HHAH ' instead of 'HHAH' in the column. There are blanks before and after the value.
You could remove the leading and trailing spaces with the trim() function:
WHERE TRIM(ORG) = 'HHAH'
Using a function on the column value will prevent any index on that column from being used (as will LIKE with a leading wildcard); unless you add a function-based index on the trimmed value, there isn't much you can do about that.
Do you need the LIKE operator:
WHERE ORG LIKE '%HHAH%'
or
WHERE ORG LIKE '%' || 'HHAH' || '%'
to search for values containing 'HHAH'?
I would recommend fixing the data:
update t
set org = trim(org);
I see no reason to be storing spaces in the name of an org. If you need spaces for reporting purposes, put them there.
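The difference between the three suggestions can be tried out on a throwaway table. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for Oracle (an assumption: TRIM and LIKE behave the same way for this simple case, but Oracle-specific details such as CHAR padding are not modelled), and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (org TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?)",
    [(" HHAH ",), ("XHHAHX",), (" ABCD ",)],  # invented sample data
)


def orgs(where: str):
    """Return the org values matching a WHERE condition, sorted."""
    sql = f"SELECT org FROM t WHERE {where} ORDER BY org"
    return [row[0] for row in conn.execute(sql)]


exact = orgs("org = 'HHAH'")          # padded value does not compare equal
trimmed = orgs("TRIM(org) = 'HHAH'")  # only the padded 'HHAH' row
like = orgs("org LIKE '%HHAH%'")      # also picks up 'XHHAHX'

# Fixing the data once makes the plain equality filter work afterwards.
conn.execute("UPDATE t SET org = TRIM(org)")
after_fix = orgs("org = 'HHAH'")

print(exact, trimmed, like, after_fix)
```

The LIKE variant is the loosest of the three, since it matches any value that merely contains the four letters.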

Can I use Regular Expressions in USQL?

Is it possible to write regular expression comparisons in USQL?
For example, rather than multiple "LIKE" statements to search for the name of various food items, I want to perform a comparison of multiple items using a single Regex expression.
You can create a new Regex object inline and then use the IsMatch() method.
The example below returns "Y" if the Offer_Desc column contains the word "bacon", "croissant", or "panini".
@output =
    SELECT
        CSHARP(new Regex("\\b(BACON|CROISSANT|PANINI)S?\\b"
              )).IsMatch(wrk.Offer_Desc.ToUpper())
        ? "Y"
        : "N" AS Is_Food
    FROM ... AS wrk
Notes:
The CSHARP() block is optional, but you do need to escape any backslashes in your regex by doubling them (as in the example above).
The regex sample matches these only as whole words, in either singular or plural form ("paninis" is okay but "baconator" is not).
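The word-boundary behaviour described in the notes is easy to check in isolation. The snippet below is a Python sketch, not U-SQL: it assumes the pattern behaves the same in Python's re module as in .NET for this simple case, and the function name is_food and the sample strings are illustrative.

```python
import re

# Raw string, so the backslashes do not need doubling as they do in U-SQL.
FOOD_PATTERN = re.compile(r"\b(BACON|CROISSANT|PANINI)S?\b")


def is_food(offer_desc: str) -> str:
    """Return "Y" if the description contains one of the food words."""
    # search() looks anywhere in the string, like Regex.IsMatch in .NET.
    return "Y" if FOOD_PATTERN.search(offer_desc.upper()) else "N"


for text in ["Two paninis for $5", "Baconator combo", "Free croissant"]:
    print(text, is_food(text))
```

The trailing \b is what rejects "baconator": after BACON comes another word character, so there is no word boundary at that position.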
I'd assume it would be the same inline, but when I used regex in code behind I hit some show-stopping speed issues.
If you are checking a reasonable number of food items I'd really recommend just using an inline ternary statement to get the results you're looking for.
@output =
SELECT
wrk.Offer_Desc.ToLowerInvariant() == "bacon" ||
wrk.Offer_Desc.ToLowerInvariant() == "croissant" ||
wrk.Offer_Desc.ToLowerInvariant() == "panini" ? "Y" : "N" AS Is_Food
FROM ... AS wrk
If you do need to check whether a string contains a substring, the string Contains method might still be a better approach.
@output =
SELECT
wrk.Offer_Desc.ToLowerInvariant().Contains("bacon") ||
wrk.Offer_Desc.ToLowerInvariant().Contains("croissant") ||
wrk.Offer_Desc.ToLowerInvariant().Contains("panini") ? "Y" : "N" AS Is_Food
FROM ... AS wrk

Querying with objects.extra in django

How can I query with to_tsquery for a partial word match?
For example
records
'hello old world'
'hello world'
'hi welcome'
'hi'
Here I want to return all records that include the word 'hello' or 'welcome':
SELECT * FROM accounts_order
WHERE name_tsvector @@ to_tsquery('english','hello | welcome');
This returns the expected rows.
I then tried to implement it with Django's 'objects.extra' query:
queryset = Order.objects.extra(where=['name_tsvector @@ to_tsquery(%s|%s)'], params=['hello','welcome'])
This query is not working; I get an exception:
operator is not unique: unknown | unknown
LINE 1: ...nts_order" WHERE name_tsvector @@ to_tsquery(E'olmin'|E'20')
^
HINT: Could not choose a best candidate operator. You might need to add explicit type casts.
How can I pass the params as a list?
It appears that you want the | within the string, ie a boolean OR in the tsquery:
regress=> select to_tsquery('english', 'olmin|20');
to_tsquery
----------------
'olmin' | '20'
(1 row)
Django is expanding %s to E'string', so you can't write %s|%s; as you've seen that expands to E'string1'|E'string2' which is interpreted as a boolean OR on the two strings. You must either:
Concatenate the two strings and | in Django with (eg) params=['hello'+'|'+'welcome'] and a single (%s) argument; or
Get Pg to concatenate the two strings with a literal |, eg (%s||'|'||%s)
I'd recommend the first option; it requires you to change the parameters you pass from Python but it produces vastly simpler SQL.
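A minimal sketch of the first option follows. The helper name tsquery_filter is hypothetical, the 'english' configuration is taken from the raw SQL in the question, and the search terms are assumed to be plain single words (spaces or tsquery operators inside a term would need extra handling).

```python
def tsquery_filter(terms):
    """Build (where, params) for QuerySet.extra() matching any of `terms`."""
    # Join in Python so the database receives ONE string parameter,
    # e.g. 'hello|welcome', instead of two literals with | between them.
    query = "|".join(terms)
    where = ["name_tsvector @@ to_tsquery('english', %s)"]
    params = [query]
    return where, params


where, params = tsquery_filter(["hello", "welcome"])
print(where, params)

# Usage with the Order model from the question (needs a Django project):
# queryset = Order.objects.extra(where=where, params=params)
```

Because the list of terms is joined before it reaches the query, the same call works for any number of words.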
The original is invalid; it's trying to perform a boolean OR on two string literals:
regress=> select to_tsquery('english', 'olmin'|'20');
ERROR: operator is not unique: unknown | unknown
LINE 1: select to_tsquery('english', 'olmin'|'20');
^
HINT: Could not choose a best candidate operator. You might need to add explicit type casts.