How to write a query by joining two tables? - sql

I have two tables:
Table 1:
ID NAME
1 ID1
2 ID2
3 ID3
4 ID4
5 ID5
6 ID6
7 ID7
Table 2:
Parent_ID Child_ID
1 2
2 5
2 3
3 6
How do I write a query to get the output below when I set Parent_ID = 1 in the WHERE condition?
P_ID NAME Is_Group Selected
1 ID1 Yes No
2 ID2 Yes Yes
3 ID3 Yes Yes
4 ID4 No No
5 ID5 No Yes
6 ID6 No Yes
7 ID7 No No
So the output mainly contains the records from Table 1, but it also needs two additional columns.
The value in the Is_Group column should be "Yes" if the ID from Table 1 exists in the Parent_ID column of Table 2. The value in the Selected column should be "Yes" if the ID from Table 1 exists in the Child_ID column of Table 2 for Parent_ID = 1 (like a cross reference).
In addition, I need to check whether a Child_ID has any cross references of its own. For example, in Table 2 Child_ID 2 belongs to Parent_ID 1, and 2 in turn has 5 and 3 as children, so the Selected column must also be "Yes" for IDs 3 and 5, and so on down the hierarchy.
Thanks in advance for your reply. Sorry for my English.

This should give you the output you need.
It uses a recursive CTE to get the hierarchy.
It then outer joins to the CTE twice and checks for NULL values to determine whether the ID is a group and whether it is selected.
WITH cte AS
(
SELECT Parent_ID,
Child_ID
FROM Table2
WHERE Parent_ID = 1
UNION ALL
SELECT t2.Parent_ID,
t2.Child_ID
FROM Table2 t2
INNER JOIN cte ON t2.Parent_ID = cte.Child_ID
)
SELECT DISTINCT
t1.*,
(CASE WHEN grp.Parent_ID IS NULL THEN 'No'
ELSE 'Yes'
END) AS Is_Group,
(CASE WHEN sel.Parent_ID IS NULL THEN 'No'
ELSE 'Yes'
END) AS Selected
FROM Table1 t1
LEFT JOIN cte grp ON t1.ID = grp.Parent_ID
LEFT JOIN cte sel ON t1.ID = sel.Child_ID
Because you're selecting everything from Table1, regardless of whether it's in the selected hierarchy, the query above gives No in Is_Group for any ID that is a Parent_ID but is not in the hierarchy CTE. To always determine whether an ID is a group, left join to Table2 as grp instead of the CTE, like this:
;WITH cte AS
(
SELECT Parent_ID,
Child_ID
FROM Table2
WHERE Parent_ID = 1
UNION ALL
SELECT t2.Parent_ID,
t2.Child_ID
FROM Table2 t2
INNER JOIN cte ON t2.Parent_ID = cte.Child_ID
)
SELECT DISTINCT
t1.*,
(CASE WHEN grp.Parent_ID IS NULL THEN 'No'
ELSE 'Yes'
END) AS Is_Group,
(CASE WHEN sel.Parent_ID IS NULL THEN 'No'
ELSE 'Yes'
END) AS Selected
FROM Table1 t1
LEFT JOIN Table2 grp ON t1.ID = grp.Parent_ID
LEFT JOIN cte sel ON t1.ID = sel.Child_ID
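For anyone who wants to try the recursive approach without a SQL Server instance, here is a minimal sketch that runs the same idea on SQLite through Python's sqlite3 module. The choice of SQLite, the WITH RECURSIVE spelling, and the use of EXISTS in place of the two LEFT JOINs plus DISTINCT are assumptions of this sketch, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (ID INTEGER PRIMARY KEY, NAME TEXT);
    CREATE TABLE Table2 (Parent_ID INTEGER, Child_ID INTEGER);
    INSERT INTO Table1 VALUES
        (1,'ID1'),(2,'ID2'),(3,'ID3'),(4,'ID4'),(5,'ID5'),(6,'ID6'),(7,'ID7');
    INSERT INTO Table2 VALUES (1,2),(2,5),(2,3),(3,6);
""")

QUERY = """
WITH RECURSIVE cte(Parent_ID, Child_ID) AS (
    -- anchor: direct children of the chosen parent
    SELECT Parent_ID, Child_ID FROM Table2 WHERE Parent_ID = ?
    UNION ALL
    -- recursive step: children of rows already found
    SELECT t2.Parent_ID, t2.Child_ID
    FROM Table2 t2
    INNER JOIN cte ON t2.Parent_ID = cte.Child_ID
)
SELECT t1.ID,
       t1.NAME,
       -- Is_Group looks at the whole of Table2, not only the hierarchy
       CASE WHEN EXISTS (SELECT 1 FROM Table2 g WHERE g.Parent_ID = t1.ID)
            THEN 'Yes' ELSE 'No' END AS Is_Group,
       -- Selected looks only at descendants of the chosen parent
       CASE WHEN EXISTS (SELECT 1 FROM cte s WHERE s.Child_ID = t1.ID)
            THEN 'Yes' ELSE 'No' END AS Selected
FROM Table1 t1
ORDER BY t1.ID
"""

rows = conn.execute(QUERY, (1,)).fetchall()
for row in rows:
    print(row)
```

Because EXISTS yields at most one result per Table1 row, DISTINCT is not needed here. Note that the UNION ALL recursion would keep going if Table2 contained a cycle.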

Try this:
select distinct id, t.NAME,
case when t1.Parent_ID is not null then 'Yes' else 'No' end Is_Group
,case when b.Child_ID is null then 'No' else 'Yes' end Selected
from Table1 t left join Table2 t1 on t.ID =t1.Parent_ID
outer apply (select Child_ID from Table2 a where a.Child_ID=t.ID ) b

Related

Multiple Unioned Self-joins in BigQuery

I have a table with id, name and parent_id where parent_id is a parent hierarchy relating to id, see below.
id name parent_id
0 A null
1 B 0
2 C 1
3 D 1
4 E 2
I'm trying to create a nicer looking table with each id and its parent_id, including multiple levels up in the hierarchy. I use UNION and self-join to accomplish this, but I have a feeling there should be a nicer way of querying it with BigQuery's Standard SQL.
In the query below I go two levels, but you can imagine I want to go 5-6 levels.
WITH T1 as (
select 0 as id, 'A' as name, null as parent_id union all
select 1 as id, 'B' as name, 0 as parent_id union all
select 2 as id, 'C' as name, 1 as parent_id union all
select 3 as id, 'D' as name, 1 as parent_id union all
select 4 as id, 'E' as name, 2 as parent_id
)
SELECT
a.id as id,
a.name as req_name,
FROM T1 as a
UNION ALL
SELECT
a.id as id,
b.name as req_name,
FROM T1 as a
JOIN T1 as b ON a.parent_id = b.id
UNION ALL
SELECT
a.id as id,
c.name as req_name,
FROM T1 as a
JOIN T1 as b on a.parent_id = b.id
JOIN T1 as c on b.parent_id = c.id
resulting in the table
id req_name
0 A
1 B
2 C
3 D
4 E
2 A
3 A
4 B
1 A
2 B
3 B
4 C
I would be thankful for any insights!
At the time of writing, BigQuery did not support recursive or hierarchical queries (recursive CTEs via WITH RECURSIVE were added later), so your approach is actually fine. You can condense it, if you like, using left joins:
with t as (
select 0 as id, 'A' as name, null as parent_id union all
select 1 as id, 'B' as name, 0 as parent_id union all
select 2 as id, 'C' as name, 1 as parent_id union all
select 3 as id, 'D' as name, 1 as parent_id union all
select 4 as id, 'E' as name, 2 as parent_id
)
select distinct id, t1.name
from t t1 left join
t t2
on t2.parent_id = t1.id left join
t t3
on t3.parent_id = t2.id cross join
unnest(array[t1.id, t2.id, t3.id]) id
where id is not null;
You still need explicit joins to the maximum depth of the data.
The other alternative is to use a looping construct, which is available in the scripting language.
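For comparison, in an engine that does support recursive CTEs the depth does not have to be spelled out join by join. The sketch below is an illustration only: it runs on SQLite through Python's sqlite3 module rather than on BigQuery, and the CTE name walk and its column next_id are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER);
    INSERT INTO t VALUES (0,'A',NULL),(1,'B',0),(2,'C',1),(3,'D',1),(4,'E',2);
""")

QUERY = """
WITH RECURSIVE walk(id, next_id, req_name) AS (
    -- level 0: every row paired with its own name
    SELECT id, parent_id, name FROM t
    UNION ALL
    -- one level up per iteration, until next_id runs out (NULL)
    SELECT w.id, p.parent_id, p.name
    FROM walk w
    JOIN t p ON p.id = w.next_id
)
SELECT id, req_name
FROM walk
ORDER BY id, req_name
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Unlike the two-level query in the question, this walks all the way to the root, so id 4 is also paired with A.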

How multiple join to the same table with other table?

I'm not sure how to title this question, but here is my problem.
table1
id | from | to
1 A B
2 C A
3 B A
table2
id | table1_id
1 1
2 1
3 1
4 3
5 3
6 2
7 2
8 2
I need to get data from table1, treating the rows with ids (1, 3) as one row, joined with table2 to fetch the last table2 id among them, which is 5.
The expected result:
id | from | to | table2_id
3 B A 5
2 C A 8
As you have not mentioned database, I have written SQL in SQL Server
WITH TAB_FLAT AS
(
SELECT TAB_1.table1_id , COALESCE ( TAB_2.table2_id , TAB_1.table2_id ) max_id , TAB_1.table2_id FROM
( select table1_id , max(id) table2_id from table2 group by table1_id ) TAB_1
LEFT JOIN
( select table1_id , max(id) table2_id from table2 group by table1_id ) TAB_2
ON TAB_1.table2_id = TAB_2.table1_id
)
select TAB_FLAT.table1_id AS ID, COALESCE(C.frm ,COALESCE(B.frm,A.frm) ) AS 'FROM' ,
COALESCE(C.to2 ,COALESCE(B.to2,A.to2) ) AS 'TO' ,
TAB_FLAT.max_id AS table2_id
FROM TAB_FLAT
LEFT JOIN table1 A ON A.id = TAB_FLAT.table1_id
LEFT JOIN table1 B ON B.id = TAB_FLAT.table2_id
LEFT JOIN table1 C ON C.id = TAB_FLAT.max_id
WHERE TAB_FLAT.table1_id IN (1,2)
Demo --> https://rextester.com/DCHUN74655
Explanation:
SELECT TAB_1.table1_id , COALESCE ( TAB_2.table2_id , TAB_1.table2_id ) max_id , TAB_1.table2_id FROM
( select table1_id , max(id) table2_id from table2 group by table1_id ) TAB_1
LEFT JOIN
( select table1_id , max(id) table2_id from table2 group by table1_id ) TAB_2
ON TAB_1.table2_id = TAB_2.table1_id
This self left join gets the largest table2 id for each table1 id, along with the corresponding intermediate level.
select TAB_FLAT.table1_id AS ID, COALESCE(C.frm ,COALESCE(B.frm,A.frm) ) AS 'FROM' ,
COALESCE(C.to2 ,COALESCE(B.to2,A.to2) ) AS 'TO' ,
TAB_FLAT.max_id AS table2_id
FROM TAB_FLAT
LEFT JOIN table1 A ON A.id = TAB_FLAT.table1_id
LEFT JOIN table1 B ON B.id = TAB_FLAT.table2_id
LEFT JOIN table1 C ON C.id = TAB_FLAT.max_id
We join to table1 three times: on the table1 id, the intermediate id, and the largest id.
WHERE TAB_FLAT.table1_id IN (1,2)
This keeps only the records with table1_id 1 or 2.
First, write a query to get each row's max ID in table 2.
select t1.*, max(t2.id) as table2_id
from table2 t2
join table1 t1 on t2.table1_id = t1.id
group by t1.id
Then use that as a CTE for two queries: one to get all the rows except 1 and 3, and one to get either 1 or 3, whichever has the greater table2_id. Union them together.
with max_table2 as (
select t1.*, max(t2.id) as table2_id
from table2 t2
join table1 t1 on t2.table1_id = t1.id
group by t1.id
)
select *
from max_table2
where id not in (1,3)
union
(
select *
from max_table2
where id in (1,3)
order by table2_id desc
limit 1
)
If the combined 1/3 row must have the ID of 1, despite 3 having the larger table2_id, this can be done by hard coding the ID in the select query.
with max_table2 as (
select t1.*, max(t2.id) as table2_id
from table2 t2
join table1 t1 on t2.table1_id = t1.id
group by t1.id
)
select *
from max_table2
where id not in (1,3)
union
(
select 1 as id, "from", "to", table2_id
from max_table2
where id in (1,3)
order by table2_id desc
limit 1
)
Try it.
I suspect rather than hard coding rows 1 and 3, you're actually counting them as equivalent because they have the same path, just reversed. We can make this query more generic.
First, normalize the from/to so they're in the same order. While we're at it, also get their max table2 id.
select
t1.id,
case when "from" < "to" then "from" else "to" end as "from",
case when "from" < "to" then "to" else "from" end as "to",
max(t2.id) as max_table2_id
from table2 t2
join table1 t1 on t2.table1_id = t1.id
group by t1.id
id from to max_table2_id
2 A C 8
3 A B 5
1 A B 3
Then rank the paths with the same from/to.
with normalized_max_table2 as (
select
t1.id,
case when "from" < "to" then "from" else "to" end as "from",
case when "from" < "to" then "to" else "from" end as "to",
max(t2.id) as table2_id
from table2 t2
join table1 t1 on t2.table1_id = t1.id
group by t1.id
)
select *,
rank() over (partition by "from", "to" order by table2_id desc) as "rank"
from normalized_max_table2
id from to table2_id rank
3 A B 5 1
1 A B 3 2
2 A C 8 1
And, finally, select only the first ranks.
with normalized_max_table2 as (
select
t1.id,
case when "from" < "to" then "from" else "to" end as "from",
case when "from" < "to" then "to" else "from" end as "to",
max(t2.id) as table2_id
from table2 t2
join table1 t1 on t2.table1_id = t1.id
group by t1.id
),
ranked_max_table2 as (
select *,
rank() over (partition by "from", "to" order by table2_id desc) as "rank"
from normalized_max_table2
)
select id, "from", "to", table2_id
from ranked_max_table2
where "rank" = 1
id from to table2_id
3 A B 5
2 A C 8
That's a long-hand way of doing it. There may be a more compact way.
Try it.
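If the original from/to orientation has to survive in the output (the question's expected result shows B, A rather than A, B), one possible variation is to normalize only inside the PARTITION BY and select the original columns. This is a sketch under assumptions: it runs on SQLite (3.25 or later, for window functions) through Python's sqlite3 module, and it relies on SQLite's two-argument scalar MIN and MAX; other databases would need LEAST/GREATEST or the CASE expressions shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, "from" TEXT, "to" TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, table1_id INTEGER);
    INSERT INTO table1 VALUES (1,'A','B'),(2,'C','A'),(3,'B','A');
    INSERT INTO table2 VALUES (1,1),(2,1),(3,1),(4,3),(5,3),(6,2),(7,2),(8,2);
""")

QUERY = """
WITH max_table2 AS (
    -- largest table2 id per table1 row
    SELECT t1.id, t1."from", t1."to", MAX(t2.id) AS table2_id
    FROM table2 t2
    JOIN table1 t1 ON t2.table1_id = t1.id
    GROUP BY t1.id, t1."from", t1."to"
),
ranked AS (
    SELECT *,
           RANK() OVER (
               -- A/B and B/A fall into the same partition
               PARTITION BY MIN("from", "to"), MAX("from", "to")
               ORDER BY table2_id DESC
           ) AS rnk
    FROM max_table2
)
SELECT id, "from", "to", table2_id
FROM ranked
WHERE rnk = 1
ORDER BY table2_id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The ranking is the same as in the long-hand version; only the columns returned differ.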

SQL Complex join not giving distinct result

I have two tables :-
Table1:-
ID1
1
1
1
1
4
5
Table2:-
Id2
2
2
1
1
1
8
I want to show all the ID2 from table2 which are present in ID1 of table1 by using joins
I used :-
select ID2 from Table2 t2 left join Table1 t1
on t2.Id2=t1.Id1
But this was giving repeated result as :-
Id2
1
1
1
1
1
1
1
It should show me 1 as 3 times only as it is present in Table2 3 times.
Please help.
You're matching the value 1 against 4 rows in Table1 and 3 rows in Table2, which is why the join produces 12 rows for it. You need an additional JOIN condition. You can add a ROW_NUMBER and do an INNER JOIN to achieve your desired result.
WITH Cte1 AS(
SELECT *,
rn = ROW_NUMBER() OVER(PARTITION BY Id1 ORDER BY (SELECT NULL))
FROM Table1
),
Cte2 AS(
SELECT *,
rn = ROW_NUMBER() OVER(PARTITION BY Id2 ORDER BY (SELECT NULL))
FROM Table2
)
SELECT c2.Id2
FROM Cte2 c2
INNER JOIN Cte1 c1
ON c1.Id1 = c2.Id2
AND c1.rn = c2.rn
However, you can achieve the desired result without using a JOIN.
SELECT *
FROM Table2 t2
WHERE EXISTS(
SELECT 1 FROM Table1 t1 WHERE t1.Id1 = t2.Id2
)
This is the expected behavior of a join. It matches every row of one table with every matching row of the other, so the join result contains 12 rows with the value 1.
You can use either query below to get the desired result.
select ID2 from Table2 t2 WHERE ID2 IN (SELECT ID1 FROM Table1 t1)
select id2 from table2 t2 where exists ( select 1 from table1 t1 where t1.id1 = t2.id2)
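The row multiplication is easy to reproduce. The sketch below loads the question's data into an in-memory SQLite database (SQLite and Python's sqlite3 module are an assumption made for the sake of a runnable example) and compares a plain join with the EXISTS filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Id1 INTEGER);
    CREATE TABLE Table2 (Id2 INTEGER);
    INSERT INTO Table1 VALUES (1),(1),(1),(1),(4),(5);
    INSERT INTO Table2 VALUES (2),(2),(1),(1),(1),(8);
""")

# A join pairs each of the 3 matching Table2 rows with each of the
# 4 matching Table1 rows.
joined = conn.execute("""
    SELECT t2.Id2
    FROM Table2 t2
    JOIN Table1 t1 ON t2.Id2 = t1.Id1
""").fetchall()

# EXISTS only asks whether a match exists, so each Table2 row
# appears at most once.
semi = conn.execute("""
    SELECT t2.Id2
    FROM Table2 t2
    WHERE EXISTS (SELECT 1 FROM Table1 t1 WHERE t1.Id1 = t2.Id2)
""").fetchall()

print(len(joined), len(semi))
```

The join returns 12 rows and the EXISTS query returns 3, one per matching row of Table2.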
Your join logic works fine; the problem is that each ID2 matches every equal ID1. A simple solution is to join with a table of distinct ID1s to avoid this duplication.
select
t2.ID2
from Table2 t2
left join (select distinct * from Table1) t1
on t1.Id1=t2.Id2
where t1.ID1 is not null
;
Here is a functional example
This will select your entire ID2 list with ID1 populated in a column. ID1 is null where there was no match. Select your ID2 column from this table but just don't pull null values (with where clause):

require to form a sql query

I was working on preparing a query where I was stuck.
Consider tables below:
table1
id key col1
-- --- -----
1 1 abc
2 2 d
3 3 s
4 4 xyz
table2
id col1 foreignkey
-- ---- ----------
1 12 1
2 13 1
3 14 1
4 12 2
5 13 2
Now what I need is to select only those records from table1 for which the corresponding entries in table2 do not have a given col1 value, say 12.
The challenge is that after applying the join, even though the row with col1 equal to 12 is skipped, there are still other rows with values such as 13 and 14 that have the same foreignkey. What I want is that if even a single row has the value 12, that id should not be picked from table1 at all.
How can I form a query for this?
For example, from the table structure above I want to get those records from table1 for which no col1 value in table2 is 14.
So my query should return only row 2 from table1, and not row 1.
Another way of doing that. The first two queries are just for making the sample data.
;WITH t1(id ,[key] ,col1) AS
(
SELECT 1 , 1 , 'abc' UNION ALL
SELECT 2 , 2 , 'd' UNION ALL
SELECT 3 , 3 , 's' UNION ALL
SELECT 4 , 4 , 'xyz'
)
,t2(id ,col1, foreignkey) AS
(
SELECT 1 , 12 , 1 UNION ALL
SELECT 2 , 13 , 1 UNION ALL
SELECT 3 , 14 , 1 UNION ALL
SELECT 4 ,12 , 2 UNION ALL
SELECT 5 ,13 , 2
)
SELECT id, [key], col1
FROM t1
WHERE id NOT IN (SELECT t2.foreignkey
FROM t2
INNER JOIN t1 ON t1.Id = t2.foreignkey
WHERE t2.col1 = 14)
This is a typical case for NOT EXISTS:
SELECT id, [key], col1
FROM table1 t1
WHERE NOT EXISTS (SELECT 1
FROM table2 t2
WHERE t2.foreignkey = t1.id AND t2.col1 = 14)
The above query will not select a row from table1 if there is a single correlated row in table2 having col1 = 14.
Output:
id key col1
-------------
2 2 d
3 3 s
4 4 xyz
If you want to return records that, in addition to the criterion set above, also have correlated records in table2, then you can use the following query:
SELECT t1.id, MAX(t1.[key]) AS [key], MAX(t1.col1) AS col1
FROM table1 t1
INNER JOIN table2 t2 ON t1.id = t2.foreignkey
GROUP BY t1.id
HAVING COUNT(CASE WHEN t2.col1 = 14 THEN 1 END) = 0
Output:
id key col1
-------------
2 2 d
You can also achieve the same result as the second query using a combination of EXISTS and NOT EXISTS:
SELECT id, [key], col1
FROM table1 t1
WHERE EXISTS (SELECT 1
FROM table2 t2
WHERE t2.foreignkey = t1.id)
AND
NOT EXISTS (SELECT 1
FROM table2 t3
WHERE t3.foreignkey = t1.id AND t3.col1 = 14)
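To see the difference between the two variants on the question's data, here is a small sketch using SQLite through Python's sqlite3 module. SQLite is an assumption here, which is why the key column is quoted as "key" instead of [key], and why the grouped query lists all selected columns in GROUP BY rather than wrapping them in MAX.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, "key" INTEGER, col1 TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, col1 INTEGER, foreignkey INTEGER);
    INSERT INTO table1 VALUES (1,1,'abc'),(2,2,'d'),(3,3,'s'),(4,4,'xyz');
    INSERT INTO table2 VALUES (1,12,1),(2,13,1),(3,14,1),(4,12,2),(5,13,2);
""")

# Rows of table1 with no related row where col1 = 14
# (rows with no related rows at all also qualify).
not_exists = conn.execute("""
    SELECT id, "key", col1
    FROM table1 t1
    WHERE NOT EXISTS (SELECT 1
                      FROM table2 t2
                      WHERE t2.foreignkey = t1.id AND t2.col1 = 14)
    ORDER BY id
""").fetchall()

# Same condition, but the inner join also requires at least one related row.
having = conn.execute("""
    SELECT t1.id, t1."key", t1.col1
    FROM table1 t1
    INNER JOIN table2 t2 ON t1.id = t2.foreignkey
    GROUP BY t1.id, t1."key", t1.col1
    HAVING COUNT(CASE WHEN t2.col1 = 14 THEN 1 END) = 0
    ORDER BY t1.id
""").fetchall()

print(not_exists)
print(having)
```

Rows 3 and 4 have no entries in table2, so they pass the NOT EXISTS test but are dropped by the inner join.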
select t1.id,t1.key,
(select ROW_NUMBER() OVER(PARTITION BY col1 ORDER BY col1 DESC) AS Row,* into
#Temp from table1)
from table1 t1
inner join table2 t2 on t1.id=t2.foreignkey
where t2.col1=(select col1 from #temp where row>1)

How do I find groups of rows where all rows in each group have a specific column value

Sample data:
ID1 ID2 Num Type
---------------------
1 1 1 'A'
1 1 2 'A'
1 2 3 'A'
1 2 4 'A'
2 1 1 'A'
2 2 1 'B'
3 1 1 'A'
3 2 1 'A'
Desired result:
ID1 ID2
---------
1 1
1 2
3 1
3 2
Notice that I'm grouping by ID1 and ID2, but not Num, and that I'm looking specifically for groups where Type = 'A'. I know it's doable by joining two queries on the same table: one query to find all groups that have a single distinct Type, and another query to filter rows with Type = 'A'. But I was wondering if this can be done in a more efficient way.
I'm using SQL Server 2008, and my current query is:
SELECT ID1, ID2
FROM (
SELECT ID1, ID2
FROM T
GROUP BY ID1, ID2
HAVING COUNT( DISTINCT Type ) = 1
) AS SingleType
INNER JOIN (
SELECT ID1, ID2
FROM T
WHERE Type = 'A'
GROUP BY ID1, ID2
) AS TypeA ON
TypeA.ID1 = SingleType.ID1 AND
TypeA.ID2 = SingleType.ID2
EDIT: Updated sample data and query to indicate that I'm grouping on two columns, not just one.
SELECT ID1, ID2
FROM MyTable
GROUP BY ID1, ID2
HAVING COUNT(Type) = SUM(CASE WHEN Type = 'A' THEN 1 ELSE 0 END)
There are two alternatives that don't require the aggregation (but do require distinct)
ANTI-JOIN
SELECT DISTINCT t1.ID1, t1.ID2
FROM
T t1
LEFT JOIN T t2
ON t1.ID1 = t2.ID1
and t1.Type <> t2.Type
WHERE
t1.Type = 'A'
AND
t2.ID1 IS NULL
See it working at this data.se query Sample for 9132209 (Anti-Join)
NOT EXISTS
SELECT DISTINCT t1.ID1, t1.ID2
FROM
T t1
WHERE
t1.Type = 'A'
AND
NOT EXISTS
(SELECT 1
FROM T t2
WHERE t1.ID1 = t2.ID1 AND t2.Type <> 'A')
See it working at this data.se query Sample for 9132209 Not Exists
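The approaches above do not correlate on the same columns, which matters on the sample data as posted. The sketch below (SQLite through Python's sqlite3 module, an assumption made only to have something runnable) runs the HAVING query grouped on (ID1, ID2) next to the NOT EXISTS query correlated on ID1 alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T (ID1 INTEGER, ID2 INTEGER, Num INTEGER, Type TEXT);
    INSERT INTO T VALUES
        (1,1,1,'A'),(1,1,2,'A'),(1,2,3,'A'),(1,2,4,'A'),
        (2,1,1,'A'),(2,2,1,'B'),
        (3,1,1,'A'),(3,2,1,'A');
""")

# Groups on (ID1, ID2): keep a group when every row in it has Type 'A'.
per_pair = conn.execute("""
    SELECT ID1, ID2
    FROM T
    GROUP BY ID1, ID2
    HAVING COUNT(Type) = SUM(CASE WHEN Type = 'A' THEN 1 ELSE 0 END)
    ORDER BY ID1, ID2
""").fetchall()

# Correlates on ID1 only: a single non-'A' row anywhere under the same ID1
# removes all of that ID1's pairs.
per_id1 = conn.execute("""
    SELECT DISTINCT t1.ID1, t1.ID2
    FROM T t1
    WHERE t1.Type = 'A'
      AND NOT EXISTS (SELECT 1
                      FROM T t2
                      WHERE t2.ID1 = t1.ID1 AND t2.Type <> 'A')
    ORDER BY t1.ID1, t1.ID2
""").fetchall()

print(per_pair)
print(per_id1)
```

On this data the grouped query also returns (2, 1), because that pair's only row has Type 'A', while the ID1-only correlation reproduces the desired result listed in the question. To make the NOT EXISTS version group on both columns, add t2.ID2 = t1.ID2 to the correlation.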