BigQuery Match Table Lookup for DCM Data Transfer - google-bigquery

With DCM's Data Transfer v2 you get 3 main tables of data in GCS:
p_activity_166401
p_click_166401
p_impression_166401
Along with a plethora of match tables like:
p_match_table_advertisers_166401
p_match_table_campaigns_166401
Table 1: p_activity_166401
Row | Event_time | User_ID | Advertiser_ID | Campaign_ID |
------ | ------------- | ------- | ------------- | ----------- |
1 | 149423090566 | AMsySZa | 5487307 | 9638421 |
2 | 149424804284 | 2vmdsXS | 5487307 | 10498283 |
Table 2: p_match_table_advertisers_166401
Row | Advertiser_ID | Advertiser |
------ | ------------- | ----------- |
1 | 5487307 | Company A |
2 | 5487457 | Company B |
How do I reference a value from Table 1 in Table 2 and return the value from Table 2 in a query?
I'd like a result like:
Row | Advertiser | User_ID |
------ | ---------- | ----------- |
1 | Company A | AMsySZa |
2 | Company A | 2vmdsXS |
Been searching around here and online and I just can't seem to find a clear reference on how to do the lookups across table, apologies in advance is this is a really simple thing I'm missing :)
EDIT
So with a nudge in the right direction I have found the JOIN function...
SELECT
*
FROM
[dtftv2_sprt.p_activity_166401]
INNER JOIN
[dtftv2_sprt.p_match_table_advertisers_166401]
ON
[p_activity_166401.Advertiser_ID] =
p_match_table_advertisers_166401.Advertiser_ID]
LIMIT
100;
Error: Field 'p_activity_166401.Advertiser_ID' not found.
That is definitely a field in the table.

So this query works great in creating a view with all the data in it.
SELECT
*
FROM
[dtftv2_sprt.p_activity_166401]
INNER JOIN
[dtftv2_sprt.p_match_table_advertisers_166401]
ON
dtftv2_sprt.p_activity_166401.Advertiser_ID = dtftv2_sprt.p_match_table_advertisers_166401.Advertiser_ID;
Using the view I can now run smaller queries to pull the data I want out. Thanks for guiding me in the right direction Mikhail Berlyant.

Related

Snowflake Create View with JSON (VARIANT) field as columns with dynamic keys

I am having a problem creating VIEWS with Snowflake that has VARIANT field which stores JSON data whose keys are dynamic and keys definition is stored in another table. So I want to create a VIEW that has dynamic columns based on the foreign key.
Here are my table looks like:
companies:
| id | name |
| -- | ---- |
| 1 | Company 1 |
| 2 | Company 2 |
invoices:
| id | invoice_number | custom_fields | company_id |
| -- | -------------- | ------------- | ---------- |
| 1 | INV-01 | {"1": "Joe", "3": true, "5": "2020-12-12"} | 1 |
| 2 | INV-01 | {"2":"Hello", "4": 1000} | 2 |
customization_fields:
| id | label | data_type | company_id |
| -- | ----- | --------- | ---------- |
| 1 | manager | text | 1 |
| 2 | reference | text | 2 |
| 3 | emailed | boolean | 1 |
| 4 | account | integer | 2 |
| 5 | due_date | date | 1 |
So I want to create a view for getting each companies invoices something like:
CREATE OR REPLACE VIEW companies_invoices AS SELECT * FROM invoices WHERE company_id = 1
which should get a result like below:
| id | invoice_number | company_id | manager | emailed | due_date |
| -- | -------------- | ---------- | ------- | ------- | -------- |
| 1 | INV-01 | 1 | Joe | true | 2020-12-12 |
So my challenge above here is I cannot make sure the keys when I write the query. If I know that I could write
SELECT
id,
invoice_number,
company_id,
custom_fields:"1" AS manager,
custom_fields:"3" AS emailed,
custom_fields:"5" AS due_date
FROM invoices
WHERE company_id = 1
These keys and labels are written in the customization_fields table, so I tried different ways and I am not able to do that.
So could anyone tell me if we can do or not? If we can please give me an example so it would really help.
You cannot do what you want to do with a view. A view has a fixed set of columns and they have specific types. Retrieving a dynamic set of columns requires some other mechanism.
If you're trying to change the number of columns or the names of the columns based on the rows in the customization_fields table, you can't do it in a view.
If you have a defined schema and just need to grab dynamic JSON properties, you may want to consider looking into Snowflake's GET function. It allows you to get any part of a JSON using a string for the path rather than using a literal path in the SQL statement. For example:
create temp table foo(v variant);
insert into foo select parse_json('{ "name":"John", "age":30, "car":null }');
-- This uses a literal path in the SQL to get to a JSON property
select v:name::string as first_name from foo;
-- This uses the GET function to get the value from a path in a string
select get(v, 'name')::string as first_name from foo;
You can replace the 'name' in the second parameter of the GET function with the value stored in the customization_fields table.
In SF, You will have to use a Stored Proc function to retrieve the dynamic set of columns

Filling information from same and other tables in SQL

For my further work I need to create a lookup table where all the different IDs my data has (because of different sources) are noted.
It has to look like this:
Lookup_Table:
| Name | ID_source1 | ID_source2 | ID_source3 |
-----------------------------------------------
| John | EMP_992 | AKK81239K | inv1000003 |
Note, that Name and ID_Source1 are coming from the same table. The other IDs are coming from different tables. They share the same name value, so e.g. source 2 looks like this:
Source2 Table:
| Name | ID |
--------------------
| John | AKK81239K |
What is the SQL code to accomplish this? Im using Access and it doesnt seem to work with this code for source 2:
INSERT INTO Lookup_Table ([ID_Source2])
SELECT [Source2].[ID]
FROM Lookup_Table LEFT JOIN [Source2]
ON [Lookup_Table].[Name] = [Source2].[Name]
It just adds the ID from Source2 in a new row:
| Name | ID_source1 | ID_source2 | ID_source3 |
-----------------------------------------------
| John | EMP_992 | | |
| | | AKK81239K | |
Hope you guys can help me.
You're looking for an UPDATE query, not an INSERT query.
An UPDATE query updates existing records. An INSERT query inserts new records into a table.
UPDATE Lookup_Table
INNER JOIN [Source2] ON [Lookup_Table].[Name] = [Source2].[Name]
SET [ID_Source2] = [Source2].[ID]

Codeigniter - get related entries

I have two tables, one that store some data like name, sex etc. end the seconds that store images related from the id of the first table.
I need to get the row of the first table and get also all the images from the second one and I would like to get them with a single query.
How can I do that? is it possibile?
I have tried with a simple join but I get just the first element of the second table.
Example Table 1:
----------------------
|ID | Name | Sex |
----------------------
| 1 | Chris | Male |
----------------------
| 2 | Elisa | Female |
----------------------
Table 2:
------------------------------
| ID | User_id | Image name |
------------------------------
| 1 | 1 | img1.jpg |
------------------------------
| 2 | 1 | img2.jpg |
------------------------------
| 3 | 1 | img3.jpg |
------------------------------
| 4 | 2 | img4.jpg |
------------------------------
I would like to have:
Chris, Male, img1.jpg, img2.jpg, img3.jpg
Elisa, Female, img4.jpg
And so on...any advice?
SELECT
table1.*,
GROUP_CONCAT(table2.image_name SEPARATOR ', ') as image_name
FROM table1
LEFT JOIN table2
ON table1.id = table2.user_id
GROUP BY table1.id;
Demo : SQL Fiddle
Try with a left outer join
SELECT user.*, image.image_name as image
FROM user
LEFT OUTER JOIN image
ON user.id = image.user_id
It will get all the user rows and put empty val in image if none was found.

Display another field in the referenced table for multiple columns with performance issues in mind

I have a table of edge like this:
-------------------------------
| id | arg1 | relation | arg2 |
-------------------------------
| 1 | 1 | 3 | 4 |
-------------------------------
| 2 | 2 | 6 | 5 |
-------------------------------
where arg1, relation and arg2 reference to the ids of objects in another object table:
--------------------
| id | object_name |
--------------------
| 1 | book |
--------------------
| 2 | pen |
--------------------
| 3 | on |
--------------------
| 4 | table |
--------------------
| 5 | bag |
--------------------
| 6 | in |
--------------------
What I want to do is that, considering performance issues (a very big table more than 50 million of entries) display the object_name for each edge entry rather than id such as:
---------------------------
| arg1 | relation | arg2 |
---------------------------
| book | on | table |
---------------------------
| pen | in | bag |
---------------------------
What is the best select query to do this? Also, I am open to suggestions for optimizing the query - adding more index on the tables etc...
EDIT: Based on the comments below:
1) #Craig Ringer: PostgreSQL version: 8.4.13 and only index is id for both tables.
2) #andrefsp: edge is almost x2 times bigger than object.
If you can change the structure of the database, you may try to denormalize this part of the database and make table edge with fields id, arg1_name, relation_name, arg2_name. And keep table object without changes to take names for the edge table when you insert or update it.
It is not good. Your data will be duplicates (size of the database will be greater) and it may be difficult to insert or update tables.
But it should be fast to select (no JOINs):
SELECT arg1_name, relation_name, arg2_name
FROM edge;
It won't get cheaper than this:
SELECT o1.object_name, r1.object_name, o2.object_name
FROM edge e
JOIN object o1 ON o1.id = e.arg1
JOIN object r ON r.id = e.relation
JOIN object o2 ON o2.id = e.arg2;
And you don't need more indexes. The one on object.id is the only one needed for this query.
But I seriously doubt that you want to retrieve 50 millions of rows at once, and in no particular order. You still didn't give the full picture.

PostgreSQL Inner Join on the same table + second table?

If this is a stupid question, forgive me, I'm not very familiar with PostgreSQL.
I've collected inventory data from used car dealerships in my area and stored it in a postgreSQL table. I've got a second table with particular details regarding certain makes and models. For example:
The dealership table is structured like so:
-----------------------------------------
| Dealership | Make | Model | Year | ID |
----------------------------------------|
| A | Ford | F250 | 2003 | 1 |
| A | Chevy| Cobalt| 2005 | 2 |
| B | Ford | F250 | 2003 | 1 |
| B | Dodge| Chrgr | 2012 | 3 |
-----------------------------------------
The details table is structured like so:
-----------------------------------------
| ID | DetailA| DetailB| DetailC|
-----------------------------------------
| 1 | data | data | data |
| 2 | data | data | data |
| 3 | data | data | data |
| 4 | data | data | data |
-----------------------------------------
My goal is to retrieve vehicle matches from multiple dealerships and display the appropriate details. In the above example, I would like to see:
-----------------------------------------------------
| Make | Model | Year | DetailA | DetailB | DetailC |
-----------------------------------------------------
| Ford | F250 | 2003 | data | data | data |
-----------------------------------------------------
With this result, I will know that both A and B havea 2003 Ford F250 for sale, and can view the related details of the vehicle.
I've tried many different queries, but most are variations on something like this:
SELECT DISTINCT
dealership_table.make,
dealership_table.model,
dealership_table.year
details_table.detaila,
details_table.detailb,
details_table.detailc
FROM
dealership_table
INNER JOIN
details_table
ON
dealership_table.id = details_table.id
WHERE
dealership_table.dealership = 'A'
OR
dealership_table.dealership = 'B'
However this returns all of the distinct matches from the table where dealership is either A or B. I've tried multiple inner-joins, but I an error complaining details_table is specified multiple times.
If I'm doing something really silly, I apologize. Like I said before, I'm pretty much an SQL noob.
What am I doing wrong? How should I go about retrieving the desired results? Any suggestions, solutions, or advice is greatly appreciated!
You can write:
SELECT dealership_table1.make,
dealership_table1.model,
dealership_table1.year,
details_table.detaila,
details_table.detailb,
details_table.detailc
FROM dealership_table dealership_table1
JOIN dealership_table dealership_table2
ON dealership_table1.make = dealership_table2.make
AND dealership_table1.model = dealership_table2.model
AND dealership_table1.year = dealership_table2.year
JOIN details_table
ON dealership_table.id = details_table.id
WHERE dealership_table1.dealership = 'A'
AND dealership_table1.dealership = 'B'
;
(Note that the FROM dealership_table dealership_table1 and JOIN dealership_table dealership_table2 set up distinct "aliases", so you can use the same table multiple different times in the same query without getting name-conflicts.)
I may be misunderstanding your table layout, but I think you should consider changing to a different structure. Here's what I would propose:
Vehicle:
----------------------------
| ID | Make | Model | Year |
----------------------------
| 1 | Ford | F250 | 2003 |
| 2 | Chevy| Cobalt| 2005 |
| 3 | Dodge| Chrgr | 2012 |
----------------------------
Dealership:
----------------------------
| Dealership | ID | Detail |
----------------------------
| A | 1 | data |
| A | 2 | data |
| B | 1 | data |
| B | 3 | data |
----------------------------
This way you're not storing vehicle information (make/model/year) in more than one place.
Here's how you would write your desired query given the above schema:
SELECT Make, Model, Year, A.Detail, B.Detail, C.Detail
FROM Vehicle V
LEFT OUTER JOIN Dealership A on A.Dealership = 'A' and A.id = V.id
LEFT OUTER JOIN Dealership B on B.Dealership = 'B' and B.id = V.id
LEFT OUTER JOIN Dealership C on C.Dealership = 'C' and C.id = V.id