Create Multiple Tables in BigQuery Using dbt for-loop

I am trying to create individual tables inside a single dataset in BigQuery using a for-loop in dbt, going through a list of accounts, with no success so far.
A little bit of context: I am using Stitch to fetch data from Facebook Ads and push it to our BigQuery warehouse. Then, based on the model below, I want to create a new, separate table for each account with aggregated/modelled data.
The declaration of the variables looks like:
-- table that contains list of accounts
{% set account_data = ref('bq_acct_list') %}
{% set accounts = get_column_values(table=account_data, column='bq_name_suffix') %}
And the query that the tables have to be created from is:
SELECT
DATE_TRUNC(DATE(date_start), DAY) date,
account_id,
account_name,
ROUND(SUM(spend), 2) ad_spend
FROM `{{ target.project }}.{{account}}.ads_insights`
GROUP BY 1, 2, 3
What is missing (I think) is the wrapper of the query + the for-loop itself. Can anyone help me fill in the blanks?

dbt operates under a paradigm in which one model (i.e. a .sql file in your models/ directory) is represented by one object (table/view) in your data warehouse; at the moment there's no way around that.
If you need to maintain separate tables per account, I'd consider:
Wrapping up the logic into a macro:
-- macros/account_transform.sql
{% macro account_transform(account) %}
SELECT
DATE_TRUNC(DATE(date_start), DAY) date,
account_id,
account_name,
ROUND(SUM(spend), 2) ad_spend
FROM `{{ target.project }}.{{ account }}.ads_insights`
GROUP BY 1, 2, 3
{% endmacro %}
Creating a separate model for each account, and calling the macro in each model:
-- models/my_first_account.sql
{{ account_transform('my_first_account') }}
-- models/my_second_account.sql
{{ account_transform('my_second_account') }}
Depending on your exact use-case, you might also consider creating a master table for all accounts, by unioning them together. That way, you only have to create one model. Check out the article on "Unioning together identically-structured sources" for some techniques for this approach.
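For the union approach, a minimal sketch of such a master model could look like the following. It is untested, the file name all_accounts.sql is just a placeholder, and it reuses the bq_acct_list variables from the question together with the account_transform macro above:
-- models/all_accounts.sql
-- Build one table that unions the per-account queries together.
{% set account_data = ref('bq_acct_list') %}
{% set accounts = get_column_values(table=account_data, column='bq_name_suffix') %}

{% for account in accounts %}
{{ account_transform(account) }}
{% if not loop.last %}UNION ALL{% endif %}
{% endfor %}
Note that get_column_values (typically dbt_utils.get_column_values) queries the warehouse when the model is compiled, so bq_acct_list needs to exist and be up to date before this model runs.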

Related

Qlik - Building a dynamic view

I have a SQL query that creates a table, and every month 2 new columns related to the current month are added to that table.
I have tried, without success, to set up a flat table (visual) in Qlik that will automatically expand every month to include these new columns. Is there a way to do this, and if so, please point me in the right direction.
You can have a look at the CrossTable prefix.
This prefix allows a wide table to be converted into a long table.
So suppose we have a wide source table with an Item column and one column per month (2022-10 through 2023-04).
After running the following script:
CrossTable:
CrossTable(Month, Sales)
LOAD Item,
[2022-10],
[2022-11],
[2022-12],
[2023-01],
[2023-02],
[2023-03],
[2023-04]
FROM
[C:\Users\User1\Documents\SO_75447715.xlsx]
(ooxml, embedded labels, table is Sheet1);
The final data then contains only 3 columns: Item, Month and Sales. All of the xlsx month columns (everything after Item) are collapsed under the single Month field, and all of their values are collapsed under the Sales column.
Having the data in this format then allows creating "normal" charts by adding the Month column as a dimension and using sum(Sales) as an expression.
P.S. If you don't want to keep maintaining the list of month columns as new ones are added, then the script can be:
CrossTable(Month, Sales)
LOAD
Item,
*
FROM
...

Error in joining datasets in Superset - column reference "name" is ambiguous

I have two datasets in my Superset. Let's call one prod_company and the other one prod_order. I want to maintain two datasets and join them into a third dataset.
It turns out it is "almost" working by building on top of this PR and some SQL:
SELECT * FROM {{ dataset(38) }}
JOIN {{ dataset(39) }}
ON dataset_38.company_id = dataset_39.company_id
Why almost?
Well, that's because both datasets join the table.company DB table, and both have the column table.company.name.
So the result is an error: column reference "name" is ambiguous.
And this is where I'm stuck right now.
What are the nuances of building on top of an existing dataset and using that power to join cross-databases and more?
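One way around the ambiguity, sketched here without having tested it in Superset, is to replace SELECT * with an explicit column list so that each name column gets its own alias. This assumes the dataset() macro from the PR exposes the joined datasets under the dataset_38 / dataset_39 names used above; the company_name / order_company_name aliases are just examples:
SELECT
  dataset_38.company_id,
  dataset_38.name AS company_name,  -- give each clashing "name" column a distinct alias
  dataset_39.name AS order_company_name
FROM {{ dataset(38) }}
JOIN {{ dataset(39) }}
ON dataset_38.company_id = dataset_39.company_id
Any other columns you need would be listed (and, where they clash, aliased) the same way instead of relying on *.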

How can I assign pre-determined codes (1,2,3, etc,) to a JSON-type column in PostgreSQL?

I'm extracting a table of 2000+ rows of park details. One of the columns is of JSON type (see the image of the table).
We have about 15 attributes like this, and we also have documentation of pre-determined codes assigned to each attribute.
Each row in the extracted table has a different set of attributes, as you can see in the image. Right now, I use cast(parks.services AS text) AS "details" to get all the attributes for a particular park, or I extract just one of them using the code below:
CASE
WHEN cast(parks.services AS text) LIKE '%uncovered%' THEN '2'
WHEN cast(parks.services AS text) LIKE '%{covered%' THEN '1' END AS "details"
This time around, I need to extract these attributes by assigning them the codes. As an example, let's just say
Park 1 - {covered, handicap_access, elevator} to be {1,3,7}
Park 2 - {uncovered, always_open, handicap_access} to be {2,5,3}
I have thought of using a subquery to pre-assign the codes, but I cannot wrap my head around the JSON operators; in fact, I don't know how to extract them across 2000+ rows.
It would be helpful if someone could guide me in this topic. Thanks a lot!
You should really think about normalizing your tables. Don't store arrays. You should add a mapping table to map the parks and the attribute codes. This makes everything much easier and more performant.
step-by-step demo: db<>fiddle
SELECT
t.name,
array_agg(c.code ORDER BY elems.index) as codes -- 3
FROM mytable t,
unnest(attributes) WITH ORDINALITY as elems(value, index) -- 1
JOIN codes c ON c.name = elems.value -- 2
GROUP BY t.name
1. Extract the array elements into one record per element. Add WITH ORDINALITY to save the original order.
2. Join your codes on the elements.
3. Create the code arrays. To ensure the correct order, you can use the index values created by the WITH ORDINALITY clause.
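For reference, here is a minimal setup the query above can run against. The table and column names (mytable, attributes, codes) are the ones the query assumes, and the sample values are taken straight from the question:
-- lookup table mapping each attribute name to its pre-determined code
CREATE TABLE codes (
    name text PRIMARY KEY,
    code int NOT NULL
);

INSERT INTO codes (name, code) VALUES
    ('covered', 1),
    ('uncovered', 2),
    ('handicap_access', 3),
    ('always_open', 5),
    ('elevator', 7);

-- parks with their attributes stored as a text array, as unnest(attributes) in the query expects
CREATE TABLE mytable (
    name text,
    attributes text[]
);

INSERT INTO mytable (name, attributes) VALUES
    ('Park 1', ARRAY['covered', 'handicap_access', 'elevator']),
    ('Park 2', ARRAY['uncovered', 'always_open', 'handicap_access']);
With that in place, the query returns {1,3,7} for Park 1 and {2,5,3} for Park 2, matching the example output in the question.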

Laravel, query to show big data is slow

I have a page where a user searches data from a specific date range and it shows the result in a datatable, e.g. (ID, audit type, user, new value, old value, etc.); the data comes from a 2-table relationship.
Here is my query:
$audits = \OwenIt\Auditing\Models\Audit::with('user')
->orderBy('updated_at', 'DESC')
->where('created_at', '>=', $date1)
->where('created_at', '<=', $date2)
->get();
The problem is that when the amount of data is big, the process is very slow. How can I optimize the query?
I've tried using paginate(10) or take(10), but they only show 10 records, not all of the data.
To enhance performance at database level, create an index on the column which appears in where constraints.
So create an index on the created_at column.
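For example, a raw-SQL sketch of that index; this assumes the auditing package's default audits table name, and in a Laravel project you would normally add it through a migration:
-- index the column used in the date-range filter so it doesn't require a full table scan
CREATE INDEX audits_created_at_index ON audits (created_at);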
Also, to compare dates, why not use whereDate rather than comparing string literals for dates?
$audits = \OwenIt\Auditing\Models\Audit::with('user')
->orderBy('updated_at', 'DESC')
->whereDate('created_at','>=',$date1)
->whereDate('created_at','<=',$date2)
->paginate(25);
Once the query is returning paginated records, the view can then provide the pagination links for the visitors/users to page through the paginated result sets:
{{ $audits->links() }}
The links() call will insert pagination link buttons into the view, which users/visitors can click to page through the result sets.
Laravel docs: https://laravel.com/docs/8.x/pagination#displaying-pagination-results
For large datasets/records in database tables, it's always wise to query paginated result sets, to reduce memory usage.

Get Filtered Data From HubDB from Multiple Select Column

I have a table in HubDB as follows...
I have filtered the data by gender with the following code snippet:
{% for row in hubdb_table_rows(675094, 'gender=1') %}
But now I want to filter the data by the multiple-select column as well, and I'm stuck here.
I want the rows which have the M5 value in that multiple-select field.
You can add multiple limits to your query with an &.
I haven't tested it but it should be something like:
{% for row in hubdb_table_rows(675094, 'gender=1&multiple_feed__contains=M5') %}