Snowflake Create View with JSON (VARIANT) field as columns with dynamic keys

I am having trouble creating a view in Snowflake over a table with a VARIANT field that stores JSON data. The JSON keys are dynamic, and their definitions are stored in another table. I want to create a view whose columns are determined dynamically by the foreign key.
Here is what my tables look like:
companies:
| id | name |
| -- | ---- |
| 1 | Company 1 |
| 2 | Company 2 |
invoices:
| id | invoice_number | custom_fields | company_id |
| -- | -------------- | ------------- | ---------- |
| 1 | INV-01 | {"1": "Joe", "3": true, "5": "2020-12-12"} | 1 |
| 2 | INV-01 | {"2":"Hello", "4": 1000} | 2 |
customization_fields:
| id | label | data_type | company_id |
| -- | ----- | --------- | ---------- |
| 1 | manager | text | 1 |
| 2 | reference | text | 2 |
| 3 | emailed | boolean | 1 |
| 4 | account | integer | 2 |
| 5 | due_date | date | 1 |
So I want to create a view for getting each company's invoices, something like:
CREATE OR REPLACE VIEW companies_invoices AS SELECT * FROM invoices WHERE company_id = 1
which should get a result like below:
| id | invoice_number | company_id | manager | emailed | due_date |
| -- | -------------- | ---------- | ------- | ------- | -------- |
| 1 | INV-01 | 1 | Joe | true | 2020-12-12 |
My challenge is that I do not know the keys when I write the query. If I knew them, I could write:
SELECT
id,
invoice_number,
company_id,
custom_fields:"1" AS manager,
custom_fields:"3" AS emailed,
custom_fields:"5" AS due_date
FROM invoices
WHERE company_id = 1
These keys and labels are stored in the customization_fields table. I have tried different approaches and have not been able to make it work.
Could anyone tell me whether this is possible? If it is, an example would really help.

You cannot do what you want to do with a view. A view has a fixed set of columns and they have specific types. Retrieving a dynamic set of columns requires some other mechanism.

If you're trying to change the number of columns or the names of the columns based on the rows in the customization_fields table, you can't do it in a view.
If you have a defined schema and just need to grab dynamic JSON properties, you may want to consider looking into Snowflake's GET function. It allows you to get any part of a JSON using a string for the path rather than using a literal path in the SQL statement. For example:
create temp table foo(v variant);
insert into foo select parse_json('{ "name":"John", "age":30, "car":null }');
-- This uses a literal path in the SQL to get to a JSON property
select v:name::string as first_name from foo;
-- This uses the GET function to get the value from a path in a string
select get(v, 'name')::string as first_name from foo;
You can replace the 'name' in the second parameter of the GET function with the value stored in the customization_fields table.

In Snowflake, you will have to use a stored procedure to retrieve the dynamic set of columns.
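Since a view cannot change its own columns, one workaround is to generate the view's DDL from the rows of customization_fields and re-run it whenever the fields change. The following is a minimal sketch in Python, not a definitive implementation: the function name build_view_sql, the data_type-to-Snowflake-type mapping, and the assumption that labels are trusted, valid identifiers are all illustrative and not from the original posts.

```python
def build_view_sql(company_id, fields, view_name="companies_invoices"):
    """Build a CREATE VIEW statement with one column per custom field.

    `fields` is a list of (id, label, data_type) tuples, as they would be
    read from customization_fields for the given company.
    """
    # Assumed mapping from customization_fields.data_type to Snowflake types.
    type_map = {
        "text": "string",
        "boolean": "boolean",
        "integer": "integer",
        "date": "date",
    }
    columns = ["id", "invoice_number", "company_id"]
    for field_id, label, data_type in fields:
        cast = type_map[data_type]
        # custom_fields:"<key>" extracts the JSON value; ::<type> casts it.
        # Labels are assumed to be trusted, valid SQL identifiers.
        columns.append(f'custom_fields:"{int(field_id)}"::{cast} AS {label}')
    select_list = ",\n    ".join(columns)
    return (
        f"CREATE OR REPLACE VIEW {view_name} AS\n"
        f"SELECT\n    {select_list}\n"
        f"FROM invoices\n"
        f"WHERE company_id = {int(company_id)}"
    )


# Rows of customization_fields for company 1, taken from the question.
fields_company_1 = [
    (1, "manager", "text"),
    (3, "emailed", "boolean"),
    (5, "due_date", "date"),
]
view_sql = build_view_sql(1, fields_company_1)
print(view_sql)
```

The generated statement would then be executed against Snowflake by whatever client or stored procedure you use; the view stays fixed until it is regenerated.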

Related

Postgres: How do I count occurrences of each enum value when they exist in columns as an array?

I have an enum State which can contain values like CA, NY, etc.
If I have a table Users with a column states that contains an array of State values, for example {CA, NY}, how can I write a query to count the users grouped by each State value? For {CA, NY}, that should count 1 for CA and 1 for NY.
So if I had records like:
| id | states |
| -- | ------- |
| 1 | {CA,NY} |
| 2 | {CA} |
| 3 | {NV,CA} |
I would expect a query to output:
| State | count |
| ----- | ----- |
| CA | 3 |
| NV | 1 |
| NY | 1 |
The first piece of advice is to normalise your data. You are breaking first normal form by holding multiple pieces of information in a single column.
Assuming you can't change that, you will need to split the array into one row per value (in Postgres, unnest() does this), and you can then COUNT() and group it.
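The split-then-count idea can be sketched as follows. The Python code mirrors the logic on the sample rows from the question, and the comment shows what the equivalent Postgres query might look like, assuming the table is named users and the column states.

```python
from collections import Counter

# Sample rows from the question: (id, states)
users = [
    (1, ["CA", "NY"]),
    (2, ["CA"]),
    (3, ["NV", "CA"]),
]

# Equivalent Postgres query (table and column names assumed):
#   SELECT s AS state, count(*)
#   FROM users, unnest(states) AS s
#   GROUP BY s
#   ORDER BY s;

# "Unnest": emit one value per array element, then count each value.
state_counts = Counter(state for _id, states in users for state in states)

for state, count in sorted(state_counts.items()):
    print(state, count)
```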

Pivot SSRS Dataset

I have a dataset which looks like so
ID | PName | Node | Val |
1 | Tag | Name | XBA |
2 | Tag | Desc | Dec1 |
3 | Tag | unit | Int |
6 | Tag | tids | 100 |
7 | Tag | post | AAA |
1 | Tag | Name | XBB |
2 | Tag | Desc | Des9 |
3 | Tag | unit | Float |
7 | Tag | post | BBB |
6 | Tag | tids | 150 |
I would like the result in my report to be
Name | Desc | Unit | Tids | Post |
XBA | Dec1 | int | 100 | AAA |
XBB | Des9 | Float | 150 | BBB |
I have tried using a SSRS Matrix with
Row: PName
Data: Node
Value: Val
The result was simply one row with Name, the next row with Desc, the next with unit, and so on. The values are not all in the same row, and the second record was missing. This is possibly because there is no grouping on the dataset.
What is a good way of achieving the expected results?
I would not recommend this for a production scenario, but if you need to knock out a report quickly you can try this. I would not feel comfortable assuming that the order of the records you get will always be what you expect.
You COULD try to insert the results of the SP into a table (regular table, temp table, table variable...doesn't matter really as long as you can get an identity column added). Assuming that the rows always come out in the correct order (which is probably not a valid assumption 100% of the time) then add an identity column on the table to get a unique row number for each row. From there you should be able to write some math logic to "group" your values together and then pivot out what you want.
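create table #temp (ID int, PName varchar(100), Node varchar(100), Val varchar(100))
insert #temp exec (your stored proc)
alter table #temp add UniqueID int identity
Then use UniqueID to group records together and pivot. Integer division, for example (UniqueID - 1) / 5, puts each run of five rows into one group; modulo would only give the position of a row within its group.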
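The grouping-and-pivot step can be tried out with the sample rows from the question. This is only a sketch: it uses Python's built-in sqlite3 as a stand-in for SQL Server, the table name temp_rows is made up, and it relies on the same fragile assumption as above, namely that rows arrive in order and every record spans exactly five rows.

```python
import sqlite3

# Sample rows from the question, in the order they were shown.
rows = [
    (1, "Tag", "Name", "XBA"), (2, "Tag", "Desc", "Dec1"),
    (3, "Tag", "unit", "Int"), (6, "Tag", "tids", "100"),
    (7, "Tag", "post", "AAA"),
    (1, "Tag", "Name", "XBB"), (2, "Tag", "Desc", "Des9"),
    (3, "Tag", "unit", "Float"), (7, "Tag", "post", "BBB"),
    (6, "Tag", "tids", "150"),
]

conn = sqlite3.connect(":memory:")
# UniqueID plays the role of the identity column: 1, 2, 3, ... in insert order.
conn.execute(
    "CREATE TABLE temp_rows ("
    "UniqueID INTEGER PRIMARY KEY AUTOINCREMENT, "
    "ID INT, PName TEXT, Node TEXT, Val TEXT)"
)
conn.executemany(
    "INSERT INTO temp_rows (ID, PName, Node, Val) VALUES (?, ?, ?, ?)", rows
)

# (UniqueID - 1) / 5 is integer division: rows 1-5 -> group 0, rows 6-10 -> group 1.
# MAX(CASE ...) picks the single non-NULL value for each Node within a group.
pivoted = conn.execute("""
    SELECT
        MAX(CASE WHEN Node = 'Name' THEN Val END) AS Name,
        MAX(CASE WHEN Node = 'Desc' THEN Val END) AS "Desc",
        MAX(CASE WHEN Node = 'unit' THEN Val END) AS Unit,
        MAX(CASE WHEN Node = 'tids' THEN Val END) AS Tids,
        MAX(CASE WHEN Node = 'post' THEN Val END) AS Post
    FROM temp_rows
    GROUP BY (UniqueID - 1) / 5
    ORDER BY (UniqueID - 1) / 5
""").fetchall()

for row in pivoted:
    print(row)
```

The order of rows inside a group does not matter here, only that each group of five belongs to one record.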

BigQuery Match Table Lookup for DCM Data Transfer

With DCM's Data Transfer v2 you get 3 main tables of data in GCS:
p_activity_166401
p_click_166401
p_impression_166401
Along with a plethora of match tables like:
p_match_table_advertisers_166401
p_match_table_campaigns_166401
Table 1: p_activity_166401
Row | Event_time | User_ID | Advertiser_ID | Campaign_ID |
------ | ------------- | ------- | ------------- | ----------- |
1 | 149423090566 | AMsySZa | 5487307 | 9638421 |
2 | 149424804284 | 2vmdsXS | 5487307 | 10498283 |
Table 2: p_match_table_advertisers_166401
Row | Advertiser_ID | Advertiser |
------ | ------------- | ----------- |
1 | 5487307 | Company A |
2 | 5487457 | Company B |
How do I reference a value from Table 1 in Table 2 and return the value from Table 2 in a query?
I'd like a result like:
Row | Advertiser | User_ID |
------ | ---------- | ----------- |
1 | Company A | AMsySZa |
2 | Company A | 2vmdsXS |
I've been searching here and online and can't seem to find a clear reference on how to do lookups across tables. Apologies in advance if this is a really simple thing I'm missing :)
EDIT
So with a nudge in the right direction I have found the JOIN function...
SELECT
*
FROM
[dtftv2_sprt.p_activity_166401]
INNER JOIN
[dtftv2_sprt.p_match_table_advertisers_166401]
ON
[p_activity_166401.Advertiser_ID] =
p_match_table_advertisers_166401.Advertiser_ID]
LIMIT
100;
Error: Field 'p_activity_166401.Advertiser_ID' not found.
That is definitely a field in the table.
So this query works great in creating a view with all the data in it.
SELECT
*
FROM
[dtftv2_sprt.p_activity_166401]
INNER JOIN
[dtftv2_sprt.p_match_table_advertisers_166401]
ON
dtftv2_sprt.p_activity_166401.Advertiser_ID = dtftv2_sprt.p_match_table_advertisers_166401.Advertiser_ID;
Using the view I can now run smaller queries to pull the data I want out. Thanks for guiding me in the right direction Mikhail Berlyant.
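To get exactly the Advertiser and User_ID columns from the desired result rather than every column, the join can select just those two fields. The sketch below runs the idea on the sample rows using Python's built-in sqlite3 as a stand-in; the shortened table names and the aliases act and adv are illustrative, and BigQuery's own syntax for table references differs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Shortened, hypothetical table names; data copied from the question.
conn.executescript("""
    CREATE TABLE p_activity (
        Event_time INTEGER, User_ID TEXT,
        Advertiser_ID INTEGER, Campaign_ID INTEGER);
    CREATE TABLE p_match_table_advertisers (
        Advertiser_ID INTEGER, Advertiser TEXT);
    INSERT INTO p_activity VALUES
        (149423090566, 'AMsySZa', 5487307, 9638421),
        (149424804284, '2vmdsXS', 5487307, 10498283);
    INSERT INTO p_match_table_advertisers VALUES
        (5487307, 'Company A'),
        (5487457, 'Company B');
""")

# Look up each activity row's advertiser name through the match table.
lookup = conn.execute("""
    SELECT adv.Advertiser, act.User_ID
    FROM p_activity AS act
    INNER JOIN p_match_table_advertisers AS adv
        ON act.Advertiser_ID = adv.Advertiser_ID
    ORDER BY act.Event_time
""").fetchall()

for row in lookup:
    print(row)
```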

How to crossreference and combine values from many tables

I have three tables, tblTemplates, tblBLNALM and tblPrefs. They follow this structure:
tblPrefs:
--------------------------------------------
| Pref | Derived-Template | Template |
--------------------------------------------
|GA |BLNALM_F03 |AIN_F03 |
--------------------------------------------
|HSSD |BLNALM_F01 |AIN_F01 |
-------------------------------------------- etc...
tblBLNALM:
------------------------------------------------------------------
| Controller | Compound | Tagname | BaseTemplate | Name |
------------------------------------------------------------------
|15CP42 |15F00 |HSSD30001C |BLNALM |IN_7 |
------------------------------------------------------------------
|15CP12 |15F06 |GA123456 |BLNALM |IN_3 |
------------------------------------------------------------------ etc...
tblTemplates:
---------------------------------------
| Template | Maintenance Override |
---------------------------------------
|AIN_F01 |IN_7 |
---------------------------------------
|AIN_F02 |IN_5 |
---------------------------------------
|AIN_F03 |IN_7 |
---------------------------------------etc...
What I need to do is check whether the characters before the first digit in tblBLNALM.Tagname exist in tblPrefs; if they do, use them to determine which template it is. Then, using this template and tblTemplates, work out which Maintenance Override applies.
The end result should look kind of like this:
-----------------------------------------------------------------------------
| Controller | Compound | Tagname | Template | Maintenance Override |
-----------------------------------------------------------------------------
|15CP12 |15F06 |GA123456 |AIN_F03 |IN_7 |
----------------------------------------------------------------------------- etc...
My gut instinct was to use a few EXISTS statements and maybe nest them, but this hasn't helped, so where do I go from here?
I'm using msaccess 2010.
You can use string operations within SQL joins.
How about checking whether the Tagname begins with your Pref?
In SQL that would be:
SELECT tblBLNALM.Controller,
tblBLNALM.Compound,
tblBLNALM.Tagname,
tblTemplates.Template,
tblTemplates.[Maintenance Override]
FROM (tblTemplates
INNER JOIN tblPrefs ON tblTemplates.Template = tblPrefs.Template)
INNER JOIN tblBLNALM ON (tblPrefs.Pref = left(tblBLNALM.Tagname, len(tblPrefs.Pref)));
output will be as you described:
+------------+----------+------------+----------+----------------------+
| Controller | Compound | Tagname | Template | Maintenance Override |
+------------+----------+------------+----------+----------------------+
| 15CP12 | 15F06 | GA123456 | AIN_F03 | IN_7 |
| 15CP42 | 15F00 | HSSD30001C | AIN_F01 | IN_7 |
+------------+----------+------------+----------+----------------------+
Alternatively, join the three tables in two steps. Joining tblPrefs to tblTemplates on Template is straightforward, but Tagname in tblBLNALM cannot be joined to Pref directly. So create a query that selects all columns from tblBLNALM and adds a calculated column returning the leading letters of Tagname, then use that query instead of the table in the join with tblPrefs.
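The prefix-based join can be checked against the sample data. This is a sketch using Python's built-in sqlite3 rather than Access, so substr(..., 1, length(...)) stands in for Access's left(..., len(...)), and the column names are written without spaces or hyphens (DerivedTemplate, MaintenanceOverride) for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblPrefs (Pref TEXT, DerivedTemplate TEXT, Template TEXT);
    CREATE TABLE tblBLNALM (
        Controller TEXT, Compound TEXT, Tagname TEXT,
        BaseTemplate TEXT, Name TEXT);
    CREATE TABLE tblTemplates (Template TEXT, MaintenanceOverride TEXT);

    INSERT INTO tblPrefs VALUES
        ('GA', 'BLNALM_F03', 'AIN_F03'),
        ('HSSD', 'BLNALM_F01', 'AIN_F01');
    INSERT INTO tblBLNALM VALUES
        ('15CP42', '15F00', 'HSSD30001C', 'BLNALM', 'IN_7'),
        ('15CP12', '15F06', 'GA123456', 'BLNALM', 'IN_3');
    INSERT INTO tblTemplates VALUES
        ('AIN_F01', 'IN_7'),
        ('AIN_F02', 'IN_5'),
        ('AIN_F03', 'IN_7');
""")

# Match a tag to a pref when the tag name starts with that pref.
matches = conn.execute("""
    SELECT b.Controller, b.Compound, b.Tagname,
           t.Template, t.MaintenanceOverride
    FROM tblTemplates AS t
    INNER JOIN tblPrefs AS p
        ON t.Template = p.Template
    INNER JOIN tblBLNALM AS b
        ON p.Pref = substr(b.Tagname, 1, length(p.Pref))
    ORDER BY b.Tagname
""").fetchall()

for row in matches:
    print(row)
```

Note that if one pref is itself a prefix of another (say G and GA), a tag could match both rows, so the prefs need to be unambiguous for this join to return one row per tag.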

SQL for calculated column that chooses from value in own row

I have a table in which several identifiers of a person may be stored. In this table I would like to create a single calculated identifier column that stores the best identifier for that record, depending on which identifiers are available.
For example (some fictional sample data) ....
Table = "Citizens"
Id | LastName | DL-No | SS-No | State-Id-No | Calculated
------------------------------------------------------------------------
1 | Smith | NULL | 374-784-8888 | 7383204848 | ?
2 | Jones | JG892435262 | NULL | NULL | ?
3 | Trask | TSK73948379 | NULL | 9276542119 | ?
4 | Clinton | CL231429888 | 543-123-5555 | 1840430324 | ?
I know the order in which I would like to choose identifiers ...
Drivers-License-No
Social-Security-No
State-Id-No
So I would like the calculated identifier column to be part of the table schema. The desired results would be ...
Id | LastName | DL-No | SS-No | State-Id-No | Calculated
------------------------------------------------------------------------
1 | Smith | NULL | 374-784-8888 | 7383204848 | 374-784-8888
2 | Jones | JG892435262 | NULL | 4537409273 | JG892435262
3 | Trask | NULL | NULL | 9276542119 | 9276542119
4 | Clinton | CL231429888 | 543-123-5555 | 1840430324 | CL231429888
Is this possible? If so, what SQL would I use to calculate what goes in the "Calculated" column?
I was thinking of something like ..
SELECT
CASE
WHEN ([DL-No] is NOT NULL) THEN [DL-No]
WHEN ([SS-No] is NOT NULL) THEN [SS-No]
WHEN ([State-Id-No] is NOT NULL) THEN [State-Id-No]
AS "Calculated"
END
FROM Citizens
The easiest solution is to use coalesce():
select c.*,
coalesce([DL-No], [SS-No], [State-ID-No]) as calculated
from citizens c
However, I think your case statement will also work if you fix the syntax: END has to close the CASE expression before the AS alias.
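Both forms can be compared side by side on the first sample table from the question. The sketch below uses Python's built-in sqlite3 as a stand-in for SQL Server, so the hyphenated column names are quoted with double quotes instead of square brackets, and the CASE version is written with END placed before the alias; the alias CalculatedCase is made up for the comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE Citizens ('
    'Id INTEGER, LastName TEXT, "DL-No" TEXT, "SS-No" TEXT, "State-Id-No" TEXT)'
)
# First sample table from the question; None is stored as NULL.
conn.executemany(
    "INSERT INTO Citizens VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Smith", None, "374-784-8888", "7383204848"),
        (2, "Jones", "JG892435262", None, None),
        (3, "Trask", "TSK73948379", None, "9276542119"),
        (4, "Clinton", "CL231429888", "543-123-5555", "1840430324"),
    ],
)

# COALESCE returns its first non-NULL argument; the CASE expression
# spells out the same priority order explicitly.
calculated = conn.execute("""
    SELECT
        Id,
        COALESCE("DL-No", "SS-No", "State-Id-No") AS Calculated,
        CASE
            WHEN "DL-No" IS NOT NULL THEN "DL-No"
            WHEN "SS-No" IS NOT NULL THEN "SS-No"
            WHEN "State-Id-No" IS NOT NULL THEN "State-Id-No"
        END AS CalculatedCase
    FROM Citizens
    ORDER BY Id
""").fetchall()

for row in calculated:
    print(row)
```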