SQL query across 3 tables that are related by IDs - sql

I have 3 tables like this:
Table 1: Category (ID, Description)
Table 2: SubCategory (ID, Description, CategoryParent_ID)
Table 3: Items (ID, SubCategory_ID, Info, Documentation, etc...)
where SubCategory_ID in the Items table refers to the SubCategory table, which in turn refers to the Category table through CategoryParent_ID.
I want to write a query such that:
When I select a Category from Table 1, every item in Table 3 that is related to this Category (via SubCategory) is shown.
Example: I select IT Equipment from Table 1.
The data shown must be every item in Table 3 whose SubCategory in Table 2 references that Category in Table 1.

You probably want to look into joins.
SELECT *
FROM tableOne, tableTwo, tableThree
WHERE tableOne.ID = tableTwo.CategoryParent_ID AND tableTwo.ID = tableThree.SubCategory_ID
You can also move the join conditions out of the WHERE clause by writing explicit joins in the FROM clause.
For example (jarlh's way):
SELECT *
FROM tableOne
INNER JOIN tableTwo ON tableOne.ID = tableTwo.CategoryParent_ID
INNER JOIN tableThree ON tableTwo.ID = tableThree.SubCategory_ID
Here's some info:
https://www.w3schools.com/sql/sql_join.asp
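The joins above return every category with its items; the question also asks to restrict the result to one selected category. A minimal sketch of that, runnable with Python's built-in sqlite3 module. The table and column names follow the question; the sample rows and the items_for_category helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Category    (ID INTEGER PRIMARY KEY, Description TEXT);
    CREATE TABLE SubCategory (ID INTEGER PRIMARY KEY, Description TEXT,
                              CategoryParent_ID INTEGER REFERENCES Category(ID));
    CREATE TABLE Items       (ID INTEGER PRIMARY KEY,
                              SubCategory_ID INTEGER REFERENCES SubCategory(ID),
                              Info TEXT);
    -- Hypothetical sample data
    INSERT INTO Category    VALUES (1, 'IT Equipment'), (2, 'Furniture');
    INSERT INTO SubCategory VALUES (10, 'Laptops', 1), (11, 'Monitors', 1), (20, 'Chairs', 2);
    INSERT INTO Items       VALUES (100, 10, 'ThinkPad'), (101, 11, '24-inch display'),
                                   (102, 20, 'Office chair');
""")

def items_for_category(description):
    """Return the items whose sub-category belongs to the named category."""
    query = """
        SELECT i.ID, i.Info, s.Description
        FROM Category AS c
        INNER JOIN SubCategory AS s ON s.CategoryParent_ID = c.ID
        INNER JOIN Items       AS i ON i.SubCategory_ID = s.ID
        WHERE c.Description = ?      -- the selected category
        ORDER BY i.ID
    """
    return conn.execute(query, (description,)).fetchall()

it_items = items_for_category("IT Equipment")
print(it_items)
```

Passing the category as a bound parameter (the ? placeholder) keeps the selection out of the SQL text itself.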

Postgres SQL INNER JOIN AND ARRAY_AGG

I have three simple tables:
Category, which stores my list of different types of categories: [image: Category table]
Items, which stores my items:
JunctionTable, a connection between category and items (N:M relation): [image: item_list_category junction table]
So my problem is that I would like to write a SELECT that returns data from all of the tables mentioned. First I tried this:
SELECT item_list_id, ARRAY_AGG(category_id) AS array_category
FROM item_list_category
WHERE item_list_id = 1
GROUP BY item_list_id;
With this result:
[image: Attempt 1]
Then I wanted to Join tables:
SELECT item_list_id, ARRAY_AGG(category_id) AS array_category
FROM item_list_category
WHERE item_list_id = 1
GROUP BY item_list_id;
This is the result, but it is obviously not what I need: [image: Attempt 2]
I expected, for instance: {"Dětské", "Misteriozní"} etc.
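One way to get category names instead of ids is to join the junction table to the category table before aggregating. The sketch below runs on Python's built-in sqlite3 and makes several assumptions: the category table is assumed to have columns id and name, the sample rows are invented, and SQLite's group_concat stands in for Postgres' ARRAY_AGG (in Postgres the aggregate would be ARRAY_AGG(c.name)).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed schema: the question does not show the category columns
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE item_list_category (item_list_id INTEGER, category_id INTEGER);
    -- Hypothetical sample data
    INSERT INTO category VALUES (1, 'Dětské'), (2, 'Misteriozní'), (3, 'Other');
    INSERT INTO item_list_category VALUES (1, 1), (1, 2), (2, 3);
""")

rows = conn.execute("""
    SELECT ilc.item_list_id,
           group_concat(c.name, ',') AS category_names  -- ARRAY_AGG(c.name) in Postgres
    FROM item_list_category AS ilc
    INNER JOIN category AS c ON c.id = ilc.category_id
    WHERE ilc.item_list_id = 1
    GROUP BY ilc.item_list_id
""").fetchall()

item_id, joined = rows[0]
names = sorted(joined.split(","))  # aggregate order is not guaranteed, so sort
print(item_id, names)
```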

How to delete records in BigQuery based on values in an array?

In Google BigQuery, I would like to delete a subset of records, based on the value of a specific column. It's a query that I need to run repeatedly and that I would like to run automatically.
The problem is that this specific column is of the form STRUCT<column_1 ARRAY<STRING>, column_2 ARRAY<STRING>, ... >, and I don't know how to use such a column in the WHERE clause of a DELETE statement.
Here is basically what I am trying to do (this code does not work):
DELETE
FROM dataset.table t
LEFT JOIN UNNEST(t.category.column_1) AS type
WHERE t.partition_date = '2020-07-22'
AND type = 'some_value'
The error that I'm getting is: Syntax error: Expected end of input but got keyword LEFT at [3:1]
If I replace the DELETE with SELECT *, it does work:
SELECT *
FROM dataset.table t
LEFT JOIN UNNEST(t.category.column_1) AS type
WHERE t.partition_date = '2020-07-22'
AND type = 'some_value'
Does somebody know how to use such a column to delete a subset of records?
EDIT:
Here is some code to create a reproducible example with some silly data (fill in your own dataset and table name in all queries):
Suppose you want to delete all rows where category.type contains the value 'food'.
1 - create a table:
CREATE TABLE <DATASET>.<TABLE_NAME>
(
article STRING,
category STRUCT<
color STRING,
type ARRAY<STRING>
>
);
2 - Insert data into the new table:
INSERT <DATASET>.<TABLE_NAME>
SELECT "apple" AS article, STRUCT('red' AS color, ['fruit','food'] as type) AS category
UNION ALL
SELECT "cabbage" AS article, STRUCT('blue' AS color, ['vegetable', 'food'] as type) AS category
UNION ALL
SELECT "book" AS article, STRUCT('red' AS color, ['object'] as type) AS category
UNION ALL
SELECT "dog" AS article, STRUCT('green' AS color, ['animal', 'pet'] as type) AS category;
3 - Show that select works (return all rows where category.type contains the value 'food'; these are the rows I want to delete):
SELECT *
FROM <DATASET>.<TABLE_NAME>
LEFT JOIN UNNEST(category.type) type
WHERE type = 'food'
[image: Initial Result]
4 - My attempt at deleting rows where category.type contains 'food' does not work:
DELETE
FROM <DATASET>.<TABLE_NAME>
LEFT JOIN UNNEST(category.type) type
WHERE type = 'food'
Syntax error: Unexpected keyword LEFT at [3:1]
[image: Desired Result]
This is the code I used to delete the desired records (the records where category.type contains the value 'food'):
DELETE
FROM <DATASET>.<TABLE_NAME> t1
WHERE EXISTS(SELECT 1 FROM UNNEST(t1.category.type) t2 WHERE t2 = 'food')
The embarrassing thing is that I've seen this kind of answer on similar questions (for example on update queries). But I come from Oracle SQL, where I think you are required to connect the subquery to the main query in the subquery's WHERE clause (i.e. connect t1 with t2), so I didn't understand those answers. That's why I posted this question.
However, I learned that in BigQuery the subquery is already tied to the outer row: UNNEST(t1.category.type) reads the array from the current row of t1, so no extra join condition between t1 and t2 is needed.
Now it is possible to still do this (perhaps even recommended?):
DELETE
FROM <DATASET>.<TABLE_NAME> t1
WHERE EXISTS (
SELECT 1
FROM <DATASET>.<TABLE_NAME> t2
LEFT JOIN UNNEST(t2.category.type) AS type
WHERE type = 'food' AND t1.article = t2.article
)
A second difficulty for me was that the ID in my actual data is somehow hidden in an array>struct construction, so I got stuck connecting t1 and t2. Fortunately, connecting them is not always necessary.
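The correlated EXISTS delete can be tried locally with an analogue in Python's built-in sqlite3. This is only a sketch under assumptions: SQLite has no ARRAY or UNNEST, so the array is stored as JSON text in a hypothetical types column and SQLite's json_each table-valued function plays the role of UNNEST. It requires an SQLite build with the JSON functions enabled.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# 'types' holds a JSON array; it stands in for BigQuery's category.type ARRAY<STRING>
conn.execute("CREATE TABLE articles (article TEXT, color TEXT, types TEXT)")

sample = [
    ("apple", "red", ["fruit", "food"]),
    ("cabbage", "blue", ["vegetable", "food"]),
    ("book", "red", ["object"]),
    ("dog", "green", ["animal", "pet"]),
]
conn.executemany(
    "INSERT INTO articles VALUES (?, ?, ?)",
    [(article, color, json.dumps(types)) for article, color, types in sample],
)

# json_each(articles.types) expands the array of the row currently being
# checked, so the subquery is correlated without any extra join condition.
conn.execute("""
    DELETE FROM articles
    WHERE EXISTS (
        SELECT 1 FROM json_each(articles.types) AS t2 WHERE t2.value = 'food'
    )
""")

remaining = [row[0] for row in conn.execute("SELECT article FROM articles ORDER BY article")]
print(remaining)
```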
Since you did not provide any sample data I am going to explain using some dummy data. In case you add your sample data, I can update the answer.
First, according to your description, you have only a STRUCT, not an ARRAY<STRUCT<col_1, col_2>>. For this reason, you do not need UNNEST to reach the fields of the STRUCT itself (an ARRAY field inside the STRUCT still has to be unnested before you compare its elements). Below is an example of how to access a particular field within a STRUCT.
WITH data AS (
SELECT 1 AS id, STRUCT("Alex" AS name, 30 AS age, "NYC" AS city) AS info UNION ALL
SELECT 1 AS id, STRUCT("Leo" AS name, 18 AS age, "Sydney" AS city) AS info UNION ALL
SELECT 1 AS id, STRUCT("Robert" AS name, 25 AS age, "Paris" AS city) AS info UNION ALL
SELECT 1 AS id, STRUCT("Mary" AS name, 28 AS age, "London" AS city) AS info UNION ALL
SELECT 1 AS id, STRUCT("Ralph" AS name, 45 AS age, "London" AS city) AS info
)
SELECT * FROM data
WHERE info.city = "London"
Notice that the STRUCT is named info and the data we accessed is city and used it in the WHERE clause.
Now, in order to delete the rows that contain a specific value within the STRUCT (in your case I assume it would be your_struct.column_1), you can use DELETE, or MERGE with DELETE. I have saved the above data in a table to execute the examples below, which produce the same output.
First method: DELETE
DELETE FROM `project.dataset.table`
WHERE info.city = "Sydney"
Second method: MERGE and DELETE
MERGE `project.dataset.table` a
USING (SELECT * FROM `project.dataset.table` WHERE info.city = "Sydney") b
ON a.info.city = b.info.city
WHEN MATCHED AND b.id = 1 THEN
DELETE
And the output for both queries:
Row  id  info.name  info.age  info.city
1    1   Alex       30        NYC
2    1   Robert     25        Paris
3    1   Ralph      45        London
4    1   Mary       28        London
As you can see the row where info.city = "Sydney" was deleted in both cases.
It is important to point out that the rows are removed from your source table itself, so you should be careful.
Note: Since you want to run this process every day, you could use Scheduled Queries within the BigQuery Console, appending or overwriting the results after each run. Also, it is good practice not to delete data from your source table. Thus, consider creating a new table from your source table without the rows you do not want.

SQL: At least one value exists in another table

I am trying to create a table that has columns called user_id and top5_foods (a binary column). I currently have two tables: one has all of the user_ids and the foods associated with them, and the other contains only the top 5 foods, selected by some calculation.
The table that I am trying to create should have the user_id column and, if at least one of the user's favorite foods is in the top_5_food table, the value 1 in top5_foods, and otherwise 0.
Something like the following:
user_id top5_foods
----------------------
34223 1
43225 0
34323 1
I have tried to use a CASE expression, but it just duplicates the user_ids and marks 1 or 0 for each food, depending on whether that food is in the top_5_foods table. I don't want the duplicates. Could you please help?
Thank you very much
If I understand correctly, a left join and aggregation:
select uf.user_id,
(count(t.food_id) > 0) as top5_foods
from user_foods uf left join
top5_foods t
on uf.food_id = t.food_id
group by uf.user_id;
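A small runnable sketch of the left join plus aggregation, using Python's built-in sqlite3. The sample rows are invented. The flag is written as MAX(CASE ...), a portable variation of the count(...) > 0 expression, because some dialects (SQL Server, for example) do not accept a bare boolean expression as a column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_foods (user_id INTEGER, food_id INTEGER);
    CREATE TABLE top5_foods (food_id INTEGER PRIMARY KEY);
    -- Hypothetical sample data: foods 1..5 are the top 5
    INSERT INTO top5_foods VALUES (1), (2), (3), (4), (5);
    INSERT INTO user_foods VALUES (34223, 1), (34223, 9),
                                  (43225, 8), (43225, 9),
                                  (34323, 2), (34323, 3);
""")

flags = conn.execute("""
    SELECT uf.user_id,
           -- 1 if any of the user's foods matched a top-5 row, else 0
           MAX(CASE WHEN t.food_id IS NOT NULL THEN 1 ELSE 0 END) AS top5_foods
    FROM user_foods AS uf
    LEFT JOIN top5_foods AS t ON uf.food_id = t.food_id
    GROUP BY uf.user_id          -- one output row per user, no duplicates
    ORDER BY uf.user_id
""").fetchall()
print(flags)
```

Grouping by user_id is what collapses the per-food rows into a single row per user.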

SQL query not returning rows if empty

I am new to SQL and I would like to get some insights for my problem
I am using the following query,
select id,
pid
from assoc
where id in (100422, 100414, 100421, 100419, 100423)
Not all of these ids have a pid; some do and some don't. Currently the query skips the records that don't have a pid.
I would like a way which would show the results as below.
pid id
-----------
703 100422
313 100414
465 100421
null 100419
null 100423
Any help would be greatly appreciated. Thanks!
Oh, I think I've got the idea: you have to enumerate all the ids and their corresponding pids. If there's no corresponding pid, show null (a kind of outer join). If that's your case, then an Oracle solution can be:
with
-- dummy: required ids
dummy as (
select 100422 as id from dual
union all select 100414 as id from dual
union all select 100421 as id from dual
union all select 100419 as id from dual
union all select 100423 as id from dual),
-- main: actual data we have
main as (
select id,
pid
from assoc
-- you may put "id in (select d.id from dummy d)"
where id in (100422, 100414, 100421, 100419, 100423))
-- we want to print out either existing main.pid or null
select main.pid as pid,
dummy.id as id
from dummy left join main on dummy.id = main.id
The id is obtained from another table, and assoc only holds the pid associated with an id.
The assoc table seems to be the association table used to implement a many-to-many relationship between two entities in a relational database.
It contains entries only for the entities from one table that are in relationship with entities from the other table. It doesn't contain information about the entities that are not in a relationship and some of the results you want to get come from entities that are not in a relationship.
The solution to your problem is to RIGHT JOIN the table where the id column comes from and put the WHERE condition on the values retrieved from that original table (because it contains the rows you need). The RIGHT JOIN ensures that all rows from the right-side table that pass the WHERE filter are included in the result set, even when they have no matching rows in the left-side table.
Assuming the table where the id column comes from is named table1, the query you need is:
SELECT assoc.pid, table1.id
FROM assoc
RIGHT JOIN table1 ON assoc.id = table1.id
WHERE table1.id IN (100422, 100414, 100421, 100419, 100423)
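A minimal sketch of the outer-join idea, runnable with Python's built-in sqlite3. It reuses the table1 name assumed in the answer, invents sample rows that match the expected output in the question, and writes the join as table1 LEFT JOIN assoc (the mirror image of the RIGHT JOIN) because older SQLite versions do not support RIGHT JOIN. It selects table1.id so that the id is shown even when assoc has no matching row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- table1 is the assumed name of the table the ids come from
    CREATE TABLE table1 (id INTEGER PRIMARY KEY);
    CREATE TABLE assoc  (id INTEGER, pid INTEGER);
    -- Hypothetical sample data: two ids have no row in assoc
    INSERT INTO table1 VALUES (100414), (100419), (100421), (100422), (100423);
    INSERT INTO assoc  VALUES (100422, 703), (100414, 313), (100421, 465);
""")

rows = conn.execute("""
    SELECT assoc.pid, table1.id
    FROM table1
    LEFT JOIN assoc ON assoc.id = table1.id   -- keeps ids without a pid
    WHERE table1.id IN (100422, 100414, 100421, 100419, 100423)
    ORDER BY table1.id
""").fetchall()

for pid, id_ in rows:
    print(pid, id_)   # pid is None where assoc has no row for that id
```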

Oracle SQL statement for one to many to append the multiple records (fields) to one string

I am writing a SQL statement for Oracle where there is a one-to-many relationship between two tables. The table Purchase has a foreign key to the table Person, and it has a Purchase Description field.
I need to write a SELECT query that will take all the purchase records/rows and append them to each other, like so:
Person Table
PersonID  PersonName
1         John
Purchases Table
PurchaseId (PK)  PersonID (FK)  PurchaseDescription
1                1              Book
2                1              Clothes
3                1              Bag
4                1              Dinner
So the output of the query would look like this
Output = 1, Book:Bag:Clothes:Dinner
The output will be one row from the one to many relationship where there are separate records for book, bag, clothes, and dinner.
Any help is appreciated. Thanks
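To do this, use the LISTAGG function. Before Oracle 19c it requires a WITHIN GROUP (ORDER BY ...) clause; the values are ordered here by PurchaseId:
SELECT 'Output = ' || CAST(P.PersonID AS VARCHAR(100)),
LISTAGG(Pur.PurchaseDescription, ':') WITHIN GROUP (ORDER BY Pur.PurchaseId)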
FROM Person P
LEFT JOIN Purchase Pur ON P.PersonID = Pur.PersonID
GROUP BY P.PersonID
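The same one-row-per-person aggregation can be tried without an Oracle instance. The sketch below uses Python's built-in sqlite3, with SQLite's group_concat as a stand-in for Oracle's LISTAGG (an assumption made for runnability; it is not the Oracle function). Table and column names follow the question and answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person   (PersonID INTEGER PRIMARY KEY, PersonName TEXT);
    CREATE TABLE Purchase (PurchaseId INTEGER PRIMARY KEY,
                           PersonID INTEGER REFERENCES Person(PersonID),
                           PurchaseDescription TEXT);
    INSERT INTO Person   VALUES (1, 'John');
    INSERT INTO Purchase VALUES (1, 1, 'Book'), (2, 1, 'Clothes'),
                                (3, 1, 'Bag'), (4, 1, 'Dinner');
""")

row = conn.execute("""
    SELECT P.PersonID,
           group_concat(Pur.PurchaseDescription, ':') AS purchases  -- LISTAGG in Oracle
    FROM Person AS P
    LEFT JOIN Purchase AS Pur ON P.PersonID = Pur.PersonID
    GROUP BY P.PersonID
""").fetchone()

person_id, purchases = row
print(f"Output = {person_id}, {purchases}")
```

SQLite does not guarantee the order of the concatenated values here, which is the part that LISTAGG's WITHIN GROUP (ORDER BY ...) clause controls in Oracle.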