How to select rows from one table, when join matches multiple rows?

The scenario is as follows: I have these two tables:
(TABLE1)
SUPER_ID | NAME
---------+-----
1        | BOB
(TABLE2)
ID | SUPER_ID
---+---------
1  | 1
2  | 1
3  | 1
If I join these two tables as
`SELECT a.super_id, a.name
FROM TABLE1 a LEFT OUTER JOIN TABLE2 b ON a.super_id = b.super_id
WHERE a.super_id = 1`
The result will be
SUPER_ID | NAME
---------+-----
1        | BOB
1        | BOB
1        | BOB
How can I select only the rows from TABLE1 without using a GROUP BY? Thanks
UPDATE: Ok, so I have a 3rd table...
(TABLE3)
ID | TYPE
---+-----
1  | A
2  | B
3  | C
which I need to join to TABLE2 as:
SELECT a.super_id, a.name
FROM TABLE1 a INNER JOIN
TABLE2 b ON a.super_id = b.super_id INNER JOIN
TABLE3 c ON b.id = c.id
WHERE a.super_id = 1

You can do this by restricting the join based on some other [unique] criterion. The SQL syntax, of course, depends on which criterion you choose. Say you want the latest record entered: if the table has a timestamp column, you could do this:
SELECT a.super_id, a.name
FROM TABLE1 a LEFT JOIN TABLE2 b
ON b.super_id = a.super_id
and b.timestamp = (Select Max(timestamp)
From TABLE2
Where super_id = a.super_id)
WHERE a.super_id = 1
If you don't have a timestamp, but you have a unique index or key (looks like id is such), you could use that:
SELECT a.super_id, a.name
FROM TABLE1 a LEFT JOIN TABLE2 b
ON b.super_id = a.super_id
and b.id= (Select Max(id)
From TABLE2
Where super_id = a.super_id)
WHERE a.super_id = 1
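A minimal runnable sketch of the MAX(id) variant, assuming SQLite via Python's built-in sqlite3 module as a test harness (the question does not name a database, so that choice is only an assumption). The table contents mirror the question; b.id is added to the select list only to show which TABLE2 row was matched.

```python
import sqlite3

# In-memory database holding the question's sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (super_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, super_id INTEGER);
    INSERT INTO table1 VALUES (1, 'BOB');
    INSERT INTO table2 VALUES (1, 1), (2, 1), (3, 1);
""")

# Join only to the TABLE2 row with the highest id for each super_id,
# so each TABLE1 row appears at most once.
rows = conn.execute("""
    SELECT a.super_id, a.name, b.id
    FROM table1 a
    LEFT JOIN table2 b
           ON b.super_id = a.super_id
          AND b.id = (SELECT MAX(id) FROM table2 WHERE super_id = a.super_id)
    WHERE a.super_id = 1
""").fetchall()

print(rows)  # [(1, 'BOB', 3)]
```

Because the join condition pins a single TABLE2 row, a further join from b.id to TABLE3 would also stay at one row per TABLE1 row.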

There are many ways to do it.
I assume you want rows from table1 if you have super_id in table2.
You can use EXISTS
SELECT a.super_id, a.name
FROM TABLE1 a
WHERE EXISTS ( SELECT NULL FROM table2 b WHERE a.super_id = b.super_id )
AND a.super_id = 1
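To see the difference between EXISTS and a plain join, here is a small sketch, again assuming SQLite through Python's sqlite3 module as a harness. The second TABLE1 row ('ALICE') is hypothetical and was added only to show that rows without a match in TABLE2 are filtered out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (super_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, super_id INTEGER);
    -- 'ALICE' is a made-up row with no matching TABLE2 rows
    INSERT INTO table1 VALUES (1, 'BOB'), (2, 'ALICE');
    INSERT INTO table2 VALUES (1, 1), (2, 1), (3, 1);
""")

# EXISTS only tests for a match, so it never multiplies TABLE1 rows.
exists_rows = conn.execute("""
    SELECT a.super_id, a.name
    FROM table1 a
    WHERE EXISTS (SELECT NULL FROM table2 b WHERE b.super_id = a.super_id)
    ORDER BY a.super_id
""").fetchall()

# A plain join returns one row per matching TABLE2 row.
join_rows = conn.execute("""
    SELECT a.super_id, a.name
    FROM table1 a
    JOIN table2 b ON b.super_id = a.super_id
""").fetchall()

print(exists_rows)     # [(1, 'BOB')]
print(len(join_rows))  # 3
```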
UPDATE
You can move the GROUP BY into a WITH clause, so the outer query itself needs no grouping.
WITH single_b AS ( -- one row per super_id; TABLE2 has no name column, so keep its highest id
SELECT super_id, MAX(id) AS id
FROM table2
GROUP BY super_id )
SELECT a.super_id, a.name
FROM TABLE1 a INNER JOIN single_b b ON a.super_id = b.super_id
INNER JOIN TABLE3 c ON b.id = c.id
WHERE a.super_id = 1

Related

Joining tables based on the maximum id

I have found three questions which all seem to ask a similar question:
Getting max value from rows and joining to another table
Select only rows by join tables max value
Joining tables based on the maximum value
But I'm having a hard time wrapping my head around how exactly to join tables keeping only the maximum row of one of the tables when the maximum is in the id or index field itself.
I am looking for answers that only require joins, because that lets the solution work in a tool which generates queries and for which it is easy to generate the corresponding joins. Sub-queries are probably doable as well with a bit more effort. I found the answer below to be of particular interest:
SELECT DISTINCT b.id, b.filename, a1.name
FROM a a1
JOIN b
ON b.id = a1.id
LEFT JOIN a a2
ON a2.id = a1.id
AND a2.rank > a1.rank
WHERE a2.id IS NULL
However, in my case the ranking column is also the index, e.g. "id". I cannot compare for equality and greater than at the same time, because they will never be true at the same time!
Also, potentially complicating the situation is that a typical query in which I have need of this may join several tables (3-5 is not uncommon). So as a simplified example of my query:
SELECT
table1.field1, table1.field2, table1.field3,
table2.field1, table2.field2, table2.field3,
table3.field1, table3.field2, table3.field3,
table4.field1, table4.field2, table4.field3
FROM table1
INNER JOIN table2 ON
table1.field1 = table2.field1
AND table1.field2 = table2.field2
AND table2.field3 < 0
INNER JOIN table3 ON
table2.field1 = table3.field1
AND table2.field4 = table3.field4
INNER JOIN table4 ON
table1.field1 = table4.field1
AND table1.field2 = table4.field2
And what I want to do is to eliminate duplicates in table3 by only getting the row with the maximum id (e.g. MAX(table3.id)) for each unique combination of all the other fields. That is to say, the above query is returning something like this:
+-------+-------+-------+---------+
| table1| table2| table4|table3 |
+-------+-------+-------+---------+
| A | A | A | 1,... |
| A | A | A | 2,... |
| A | A | A | 3,... |
| A | A | A | MAX1,...|
| B | B | B | 1,... |
| B | B | B | 2,... |
| B | B | B | 3,... |
| B | B | B | MAX2,...|
+-------+-------+-------+---------+
(I'm just using A and B to denote that I'm talking about all the same values for the fields in table1, table2, and table4 for a particular set of rows.)
and I want to reduce it to this:
+-------+-------+-------+---------+
| table1| table2| table4|table3 |
+-------+-------+-------+---------+
| A | A | A | MAX1,...|
| B | B | B | MAX2,...|
+-------+-------+-------+---------+

You can add a derived table to reduce the matching rows in table3 to one per group. Another method would use a window function, but you asked for a JOIN-only solution.
SELECT
table1.field1, table1.field2, table1.field3,
table2.field1, table2.field2, table2.field3,
table3.field1, table3.field2, table3.field3,
table4.field1, table4.field2, table4.field3
FROM table1
INNER JOIN table2 ON
table1.field1 = table2.field1
AND table1.field2 = table2.field2
AND table2.field3 < 0
INNER JOIN table3 ON
table2.field1 = table3.field1
AND table2.field4 = table3.field4
--here is the added derived table. Change column names as needed
INNER JOIN (select UID, mx = max(ID) from Table3 group by UID) x
on x.UID = table3.UID and x.mx = table3.ID
INNER JOIN table4 ON
table1.field1 = table4.field1
AND table1.field2 = table4.field2
Or, perhaps... something like below. It really depends on your schema and that's hard to understand with the sample data.
INNER JOIN (select field1, field4, mx = max(ID) from Table3 group by field1, field4) x
on x.field1 = table3.field1 and x.field4 = table3.field4 and x.mx = table3.ID
Here is an example. You'll notice that the last three rows have identical f1/f2 pairs. You only want the last one, which is the max(id) for that grouping. Whatever makes a row unique in relation to the rest of your data (not your primary key, but what you are joining on) is what you'd want to include in the derived table and join condition.
-- table variable (T-SQL) holding the sample rows
declare @table table (id int identity(1,1), f1 char(1), f2 char(1))
insert into @table
values
('a','b'),
('a','c'),
('a','a'),
('b','b'),
('b','b'),
('b','b')
select * from @table
-- keep only the row with the highest id in each (f1, f2) group
select t1.*
from @table t1
inner join
(select f1, f2, mx = max(id) from @table group by f1, f2) t2 on
t1.f1 = t2.f1
and t1.f2 = t2.f2
and t1.id = t2.mx
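The window-function alternative mentioned in this answer could look roughly like the sketch below. It assumes SQLite 3.25 or later (for window functions), driven through Python's sqlite3 module as a harness; the table name t is hypothetical, and the sample rows are the same six as in the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER PRIMARY KEY AUTOINCREMENT, f1 TEXT, f2 TEXT);
    INSERT INTO t (f1, f2) VALUES
        ('a','b'), ('a','c'), ('a','a'), ('b','b'), ('b','b'), ('b','b');
""")

# Number the rows inside each (f1, f2) group from highest id to lowest,
# then keep only the first row of every group.
latest = conn.execute("""
    SELECT id, f1, f2
    FROM (
        SELECT id, f1, f2,
               ROW_NUMBER() OVER (PARTITION BY f1, f2 ORDER BY id DESC) AS rn
        FROM t
    )
    WHERE rn = 1
    ORDER BY id
""").fetchall()

print(latest)  # [(1, 'a', 'b'), (2, 'a', 'c'), (3, 'a', 'a'), (6, 'b', 'b')]
```

The three ('b','b') rows collapse to the one with id 6, which matches what the derived-table join returns.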

SQL filter rows based without using Group by

I have a query which performs joins over 6 tables and fetches various columns based on a condition. I want to add an extra filter condition which gives me only those members who have a count(distinct dateCaptured)>30. I'm able to get the list of members who satisfy this condition using GROUP BY and HAVING, but I don't want to group by the other column names just because of this one condition. Do I need to use PARTITION BY in this case?
Sample TABLE a
+-----+------------+--------------+
| Id | Identifier | DateCaptured |
+-----+------------+--------------+
| 1 | 05548 | 2017-09-01 |
| 2 | 05548 | 2017-09-01 |
| 3 | 05548 | 2017-09-01 |
| 4 | 05548 | 2017-09-02 |
| 5 | 05548 | 2017-09-03 |
| 6 | 05548 | 2017-09-04 |
| 7 | 37348 | 2017-08-15 |
| 8 | 37348 | 2017-08-15 |
| . | | |
| . | | |
| . | | |
| 54 | 37348 | 2017-10-15 |
+-----+------------+--------------+
Query
SELECT a.value,
b.value, c.value,
d.value
FROM Table a
INNER JOIN Table b on a.Id=b.id
INNER JOIN Table c on a.Id=c.Id and c.Invalid=0
INNER JOIN Table d on a.Id=d.Id
Assume Table a has more than 30 records for Identifier 37348. How can I get only this Identifier for the above query?
These are the patients I'm interested in for the above SELECT.
SELECT a.Identifier,count(DISTINCT DateCaptured)
FROM Table a
INNER JOIN Table b on a.Id=b.id
INNER JOIN Table c on a.Id=c.Id and c.Invalid=0
INNER JOIN Table d on a.Id=d.Id
GROUP BY Identifier
HAVING count(DISTINCT DateCaptured)>30

WITH cte as (
SELECT a.Identifier
FROM Table a
INNER JOIN Table b on a.Id=b.id
INNER JOIN Table c on a.Id=c.Id and c.Invalid=0
INNER JOIN Table d on a.Id=d.Id
GROUP BY Identifier
HAVING count(DISTINCT DateCaptured) > 30
)
SELECT a.value,
b.value, c.value,
d.value
FROM Table a
INNER JOIN Table b on a.Id=b.id
INNER JOIN Table c on a.Id=c.Id and c.Invalid=0
INNER JOIN Table d on a.Id=d.Id
INNER JOIN cte on cte.Identifier = a.Identifier

SELECT a.value,
b.value, c.value,
d.value
FROM Table a
INNER JOIN Table b on a.Id=b.id
INNER JOIN Table c on a.Id=c.Id and c.Invalid=0
INNER JOIN Table d on a.Id=d.Id
WHERE a.Identifier IN (SELECT a1.Identifier
FROM Table a1
GROUP BY a1.Identifier HAVING count(DISTINCT a1.DateCaptured)>30)
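A small runnable sketch of the IN (... GROUP BY ... HAVING ...) filter, assuming SQLite via Python's sqlite3 module as a harness. Only table a is modelled, the sample rows are made up, and the threshold is lowered from 30 to 2 so the data stays small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (Id INTEGER PRIMARY KEY, Identifier TEXT, DateCaptured TEXT);
    -- hypothetical rows: 05548 has 1 distinct date, 37348 has 3
    INSERT INTO a VALUES
        (1, '05548', '2017-09-01'),
        (2, '05548', '2017-09-01'),
        (3, '37348', '2017-08-15'),
        (4, '37348', '2017-08-16'),
        (5, '37348', '2017-08-17');
""")

THRESHOLD = 2  # the question uses 30

# The grouping happens only inside the subquery, so the outer query
# can select any columns without a GROUP BY of its own.
rows = conn.execute("""
    SELECT a.Id, a.Identifier, a.DateCaptured
    FROM a
    WHERE a.Identifier IN (
        SELECT a1.Identifier
        FROM a a1
        GROUP BY a1.Identifier
        HAVING COUNT(DISTINCT a1.DateCaptured) > ?
    )
    ORDER BY a.Id
""", (THRESHOLD,)).fetchall()

print([r[0] for r in rows])  # [3, 4, 5]
```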

If the multiple rows really are in Table a, then you can do:
SELECT a.value, b.value, c.value, d.value
FROM (SELECT a.*, COUNT(*) OVER (PARTITION BY Identifier) as cnt -- rows per Identifier
FROM a
) a INNER JOIN
b
ON a.Id = b.id INNER JOIN
c
ON a.Id = c.Id AND c.Invalid = 0 INNER JOIN
d
ON a.Id = d.Id
WHERE a.cnt > 30;
Note: If you still need count(distinct) you can do:
SELECT a.value, b.value, c.value, d.value
FROM (SELECT a.*, SUM(CASE WHEN seqnum = 1 THEN 1 ELSE 0 END) OVER (PARTITION BY Identifier) as cnt
FROM (SELECT a.*, ROW_NUMBER() OVER (PARTITION BY Identifier, DateCaptured ORDER BY Id) as seqnum -- 1 marks the first row of each distinct date
FROM a
) a
) a INNER JOIN
b
ON a.Id = b.id INNER JOIN
c
ON a.Id = c.Id AND c.Invalid = 0 INNER JOIN
d
ON a.Id = d.Id
WHERE a.cnt > 30;
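The ROW_NUMBER trick for emulating a windowed count(distinct) can be checked with a sketch like the following. It assumes SQLite 3.25 or later through Python's sqlite3 module, models only table a with made-up rows, and uses a threshold of 2 instead of 30. The partitioning by Identifier and DateCaptured is this sketch's reading of what is needed to count distinct dates per Identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (Id INTEGER PRIMARY KEY, Identifier TEXT, DateCaptured TEXT);
    -- hypothetical rows: 05548 has 2 distinct dates, 37348 has 3
    INSERT INTO a VALUES
        (1, '05548', '2017-09-01'),
        (2, '05548', '2017-09-01'),
        (3, '05548', '2017-09-02'),
        (4, '37348', '2017-08-15'),
        (5, '37348', '2017-08-16'),
        (6, '37348', '2017-08-17');
""")

# Inner query: seqnum = 1 marks the first row of each (Identifier, date).
# Middle query: summing those markers per Identifier counts distinct dates.
rows = conn.execute("""
    SELECT Id, Identifier, distinct_dates
    FROM (
        SELECT Id, Identifier,
               SUM(CASE WHEN seqnum = 1 THEN 1 ELSE 0 END)
                   OVER (PARTITION BY Identifier) AS distinct_dates
        FROM (
            SELECT Id, Identifier,
                   ROW_NUMBER() OVER (PARTITION BY Identifier, DateCaptured
                                      ORDER BY Id) AS seqnum
            FROM a
        )
    )
    WHERE distinct_dates > 2
    ORDER BY Id
""").fetchall()

print(rows)  # [(4, '37348', 3), (5, '37348', 3), (6, '37348', 3)]
```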

Get column name from one table by id in another table

I have 2 tables
Table 1:
Query_code | Item_code | Column_Name
2 | 1 | CN1
2 | 2 | CN2
2 | 3 | CN3
Table 2:
Query_code | Source_item| dest_item | pair_code
2 | 1 | 2 | 1
2 | 2 | 3 | 2
What I want to achieve is to get source_item-dest_item as the result.
According to the data, that will be:
CN1-CN2
CN2-CN3
What I tried is:
SELECT A.Column_Name
FROM TABLE1 A inner join
TABLE2 B
ON A.QUERY_CODE=B.QUERY_CODE
But this is not even close to my goal.

What you need to do is use TABLE2 to identify the source_item and dest_item, then join with TABLE1 the first time to replace source_item with the column name, and join again with TABLE1 to replace dest_item with the other column name.
SELECT A.Column_Name, B.Column_Name
FROM t2 C
LEFT JOIN t1 A
ON C.Source_item=A.Item_code
LEFT JOIN t1 B
ON C.Dest_item=B.Item_code
WHERE C.Query_code=A.Query_code
AND C.Query_code=B.Query_code

This should work. It is unclear what your Query_Code is meant to do, so I omitted it from the query.
EDIT Inserted Query_code condition as well.
SELECT
Source.Column_Name || '-' || Dest.Column_Name AS ResultPair
FROM
TABLE2 B
INNER JOIN TABLE1 Source
ON B.source_item = Source.item_code AND B.Query_code = Source.Query_code
INNER JOIN TABLE1 Dest
ON B.dest_item = Dest.item_code AND B.Query_code = Dest.Query_code;
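The double join to TABLE1 can be tried out with the question's data. This is a sketch assuming SQLite via Python's sqlite3 module as a harness; the aliases src and dst are just illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (Query_code INTEGER, Item_code INTEGER, Column_Name TEXT);
    CREATE TABLE table2 (Query_code INTEGER, Source_item INTEGER,
                         dest_item INTEGER, pair_code INTEGER);
    INSERT INTO table1 VALUES (2, 1, 'CN1'), (2, 2, 'CN2'), (2, 3, 'CN3');
    INSERT INTO table2 VALUES (2, 1, 2, 1), (2, 2, 3, 2);
""")

# TABLE1 is joined twice under different aliases: once to resolve the
# source item and once to resolve the destination item.
pairs = [row[0] for row in conn.execute("""
    SELECT src.Column_Name || '-' || dst.Column_Name
    FROM table2 b
    JOIN table1 src ON src.Item_code = b.Source_item
                   AND src.Query_code = b.Query_code
    JOIN table1 dst ON dst.Item_code = b.dest_item
                   AND dst.Query_code = b.Query_code
    ORDER BY b.pair_code
""")]

print(pairs)  # ['CN1-CN2', 'CN2-CN3']
```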

Here you go
WITH table1 (query_code, item_code ,column_name ) AS
(SELECT 2,1,'cn1' UNION ALL
SELECT 2,2,'cn2' UNION ALL
SELECT 2,3,'cn3'),
table2 (query_code , source_item, dest_item , pair_code) AS
(SELECT 2,1,2,1 UNION ALL
SELECT 2,2,3,2)
SELECT a.column_name || '-' || c.column_name
FROM table1 a
INNER JOIN table2 b ON a.item_code=b.source_item
INNER JOIN table1 c ON c.item_code=b.dest_item;

Simple SQL Questions with join

I have some problems with joining tables:
Table A -> ID, Col1, Col2, Col3
Table B -> Rank, ColX, A_ID (relationship with A.ID)
I want to take the row with the highest Rank from table B for each A_ID (like a GROUP BY A_ID).
My results must be something like A.ID, Col1, Col2, Col3, ColX. How can I do that?
I also want the result count to equal the number of A.ID values.
TableA
+----+------+------+------+
| ID | Col1 | Col2 | Col3 |
+----+------+------+------+
| 1  | C1   | C2   | C3   |
| 2  | C1   | C2   | C3   |
+----+------+------+------+
TABLE_B
+----+-----------+------+------+
| ID | COL_X     | RANK | A_ID |
+----+-----------+------+------+
| 1  | SomeValue | 1    | 1    |
| 2  | some22222 | 2    | 1    |
| 3  | SOMEXXXX  | 3    | 1    |
| 4  | SOMEVAL   | 1    | 2    |
| 5  | VALUE     | 2    | 2    |
+----+-----------+------+------+
Expected Output:
+----+------+------+------+------------------------------------------------+
| ID | Col1 | Col2 | Col3 | COLX                                           |
+----+------+------+------+------------------------------------------------+
| 1  | C1   | C2   | C3   | SOMEXXXX (highest Rank in TableB for A_ID = 1) |
| 2  | C1   | C2   | C3   | VALUE (highest Rank in TableB for A_ID = 2)    |
+----+------+------+------+------------------------------------------------+

You could easily do this using a subquery by first finding the max for each A_ID and then joining to tableA and TableB to get your desired rows:
SELECT a.ID,
a.col1,
a.Col2,
a.Col3,
b1.Col_X
FROM (
SELECT a_id
,max(rank) AS MaxRank
FROM tableb
GROUP BY a_id
) b
INNER JOIN tablea a ON a.id = b.a_id
INNER JOIN tableb b1 ON b.a_id = b1.a_id AND b1.rank = b.MaxRank
ORDER BY a.ID;
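The same query can be run against the question's sample data. This is a sketch assuming SQLite via Python's sqlite3 module as a harness; the rank column is quoted defensively because RANK is also a function name in many databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablea (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT, col3 TEXT);
    CREATE TABLE tableb (id INTEGER PRIMARY KEY, col_x TEXT,
                         "rank" INTEGER, a_id INTEGER);
    INSERT INTO tablea VALUES (1, 'C1', 'C2', 'C3'), (2, 'C1', 'C2', 'C3');
    INSERT INTO tableb VALUES
        (1, 'SomeValue', 1, 1), (2, 'some22222', 2, 1), (3, 'SOMEXXXX', 3, 1),
        (4, 'SOMEVAL', 1, 2), (5, 'VALUE', 2, 2);
""")

# The derived table finds the highest rank per a_id; joining back to
# tableb on BOTH a_id and rank picks up col_x from that one row.
rows = conn.execute("""
    SELECT a.id, a.col1, a.col2, a.col3, b1.col_x
    FROM (SELECT a_id, MAX("rank") AS max_rank
          FROM tableb
          GROUP BY a_id) b
    JOIN tablea a  ON a.id = b.a_id
    JOIN tableb b1 ON b1.a_id = b.a_id AND b1."rank" = b.max_rank
    ORDER BY a.id
""").fetchall()

print(rows)
# [(1, 'C1', 'C2', 'C3', 'SOMEXXXX'), (2, 'C1', 'C2', 'C3', 'VALUE')]
```

If two tableb rows tie for the highest rank within one a_id, this pattern returns both, so a tie-breaker would be needed to guarantee one row per A.ID.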

I'm thinking you want to take the max rank from your table b for each row in table a?
There are lots of different ways of approaching this. Here's one simple one:
with maxCTE as
(select
a_id,
max(rank) as MaxRank
from
tableb
group by
a_id
)
select
*
from
tablea a
inner join tableb b
on a.id = b.a_id
inner join maxcte c
on b.a_id = c.a_id
and b.rank = c.MaxRank
Basically, the CTE identifies the max rank for each a_id, then we join that back to tableb to get the details about that row.

with x as
(select a_id, max(rank) as mxrnk
from tableB
group by a_id)
select a.id, a.col1, a.col2, a.col3, b.col_x
from tableA a join x
on a.id = x.a_id
join tableB b
on x.a_id = b.a_id and x.mxrnk = b.rank
You can select max rank per a_id in the cte and use it to select the corresponding columns.

One way is to INNER JOIN Table B onto Table A by ID. You will have 3 records returned from Table B for A_ID = 1, which you can then ORDER by COLX:
SELECT
a.ID
,a.Col1
,a.Col2
,a.Col3
,b.COLX
FROM TableA AS a
INNER JOIN TABLE_B AS b on b.A_ID = a.id
ORDER BY b.COLX DESC
Then another way is joining a sub query of Table B that also has a sub query that filters Table B records to only the records with the highest RANK.
That way you can bring in COLX values from the highest RANK records from Table B that match the records of Table A.
I think at least...
SELECT
a.ID
,a.Col1
,a.Col2
,a.Col3
,b.COLX
FROM TableA a
INNER JOIN (
SELECT
a.A_ID
,a.RANK
,a.COLX
FROM TABLE_B a
INNER JOIN (
SELECT
A_ID
,MAX(RANK) AS [RANK] -- Highest Rank
FROM TABLE_B
GROUP BY A_ID
) AS b ON b.A_ID = a.A_ID AND b.RANK = a.RANK
) AS b on b.A_ID = a.id
ORDER BY a.ID ASC

Select A.*,D.Col_X
from
(Select C.COL_X,B.A_ID
from
(Select A_ID,MAX(rank) as MAX_rank
from TABLE_B
group by A_ID) B ----- gets the highest rank and ID of the highest rank
inner join TABLE_B c
on
C.A_ID = B.A_ID and C.RANK = B.MAX_rank) D ---- Gets the highest rank name
inner join TABLE_A A
on D.A_ID=A.ID
OUTPUT:
ID Col1 Col2 Col3 Col_X
1 c1 c2 c3 SOMEXXXX
2 c1 c2 c3 VALUE

SQL Query to get missing staff member

I have the following problem: I need to find out whether a user with a specific job role is missing from a project:
Table 1:
ID | Project_ID
---+------------
1 | 11A
1 | 11B
1 | 11C
2 | 12B
2 | 12C
3 | 13A
3 | 13C
Table 2:
Project_ID | JobRole_ID
-------------+------------
11A | A
11B | B
11C | C
12B | B
12C | C
13A | A
13C | C
Table 3:
JobRole_ID | JobRole
-----------+---------
A | Manager
B | Project Leader
C | Project Assistent
For each project, job roles A, B and C are required (Table 3). Table 2 only contains the job roles that have been added, not the missing ones.
What I expect is:
ID | JobRole
---+---------
1 | Manager
2 | NULL
3 | Manager
Please help me! Thx

You want to find projects for which there exists a JobRole that is not tied to that particular project.
Off the top of my head:
--try to find projects
SELECT
T1.Id
FROM
Table1 AS T1
WHERE
--with at least one role
EXISTS(
SELECT
*
FROM
Table3 AS T3
WHERE
--where mapping does not exist
NOT EXISTS(
SELECT
*
FROM
Table2 AS T2
WHERE
T2.Project_ID = T1.Project_ID AND
T2.JobRole_ID = T3.JobRole_ID
)
)
EDIT:
Could this be what you want? If not, could you please give us more details?
SELECT
T1.Id, T3.JobRole_ID
FROM
Table1 AS T1 LEFT JOIN
Table2 AS T2 ON T2.Project_ID = T1.Project_ID LEFT JOIN
Table3 AS T3 ON T3.JobRole_ID = T2.JobRole_ID

Based on your updated requirements this should work, although I'm sure it can be simplified.
select a.ID, b.JobRole
from (
select * from Table1, table3 where JobRole_ID = 'A'
) a left join (
select t1.id, t3.jobrole
from table1 t1
left join table2 t2 on t2.project_id = t1.project_id
left join table3 t3 on t3.jobrole_id = t2.jobrole_id
) b on a.id = b.id and a.jobrole = b.jobrole
group by a.ID, b.JobRole

Here is a method:
select p.*, j.*
from projects p cross join
jobroles j left join
projectjobs pj
on p.project_id = pj.project_id and
j.jobrole_id = pj.jobrole_id
where pj.project_id is null;
It generates a list of all projects and all jobs (using the cross join). Then, using the left join and where clause, it filters out the ones that exist.
EDIT:
You can try with comma:
select p.*, j.*
from projects p,
jobroles j left join
projectjobs pj
on p.project_id = pj.project_id and
j.jobrole_id = pj.jobrole_id
where pj.project_id is null;
But this may not work because of the semantics of how the comma is parsed in the from clause. You might need a subquery:
select xpj.*
from (select p.*, j.*
from projects p, jobroles j
) xpj left join
projectjobs pj
on xpj.project_id = pj.project_id and
xpj.jobrole_id = pj.jobrole_id
where pj.project_id is null;
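Applied to the question's three tables, the cross join plus left join idea might look like the sketch below. It assumes SQLite via Python's sqlite3 module as a harness, and it assumes that Table 1's ID is what groups the project rows (the question does not say so explicitly). The aliases ids, r and present are illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (ID INTEGER, Project_ID TEXT);
    CREATE TABLE table2 (Project_ID TEXT, JobRole_ID TEXT);
    CREATE TABLE table3 (JobRole_ID TEXT, JobRole TEXT);
    INSERT INTO table1 VALUES (1,'11A'),(1,'11B'),(1,'11C'),
                              (2,'12B'),(2,'12C'),(3,'13A'),(3,'13C');
    INSERT INTO table2 VALUES ('11A','A'),('11B','B'),('11C','C'),
                              ('12B','B'),('12C','C'),('13A','A'),('13C','C');
    INSERT INTO table3 VALUES ('A','Manager'),('B','Project Leader'),
                              ('C','Project Assistent');
""")

# 1. Cross join every ID with every required role.
# 2. Left join the roles each ID actually has.
# 3. Keep the combinations with no match: those are the missing roles.
missing = conn.execute("""
    SELECT ids.ID, r.JobRole
    FROM (SELECT DISTINCT ID FROM table1) ids
    CROSS JOIN table3 r
    LEFT JOIN (
        SELECT t1.ID, t2.JobRole_ID
        FROM table1 t1
        JOIN table2 t2 ON t2.Project_ID = t1.Project_ID
    ) present ON present.ID = ids.ID
             AND present.JobRole_ID = r.JobRole_ID
    WHERE present.ID IS NULL
    ORDER BY ids.ID, r.JobRole_ID
""").fetchall()

print(missing)  # [(2, 'Manager'), (3, 'Project Leader')]
```

With this data, ID 1 has all three roles and so does not appear in the result at all.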
