Using AND in an INNER JOIN - sql

I am fairly new to SQL and would like to understand the logic below.
SELECT *
FROM Table A A1
INNER JOIN TABLE B B1 ON B1.ID = A1.ID AND A1 = 'TASK';
Not sure if this is a clear detail but please let me know. Thanks!

SELECT *
FROM Table A A1
INNER JOIN TABLE B B1 ON B1.ID = A1.ID AND A1.Column = 'TASK'
is the same as
SELECT *
FROM Table A A1
INNER JOIN TABLE B B1 ON B1.ID = A1.ID
WHERE A1.Column = 'TASK'
For an inner join it is even the same performance-wise; it's just a different way to write the query. In very large queries it can be more readable to put the AND directly on the INNER JOIN instead of "hiding" it in the WHERE part.
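To check the equivalence yourself, here is a minimal sketch using Python's built-in sqlite3 module as a throwaway test harness. The tables a and b, the columns kind and note, and the sample rows are all made up for illustration; they are not from the question.

```python
import sqlite3

# In-memory database with two small, hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, kind TEXT);
    CREATE TABLE b (id INTEGER, note TEXT);
    INSERT INTO a VALUES (1, 'TASK'), (2, 'OTHER'), (3, 'TASK');
    INSERT INTO b VALUES (1, 'first'), (2, 'second'), (4, 'fourth');
""")

# Filter written as part of the join condition.
filter_in_on = conn.execute("""
    SELECT a1.id, a1.kind, b1.note
    FROM a AS a1
    INNER JOIN b AS b1 ON b1.id = a1.id AND a1.kind = 'TASK'
    ORDER BY a1.id
""").fetchall()

# Same filter written in the WHERE clause.
filter_in_where = conn.execute("""
    SELECT a1.id, a1.kind, b1.note
    FROM a AS a1
    INNER JOIN b AS b1 ON b1.id = a1.id
    WHERE a1.kind = 'TASK'
    ORDER BY a1.id
""").fetchall()

print(filter_in_on)     # [(1, 'TASK', 'first')]
print(filter_in_where)  # [(1, 'TASK', 'first')]
```

Both queries return the same single row here, because an inner join only keeps rows that satisfy every condition, wherever the condition is written.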

This wouldn't run at all
SELECT *
FROM Table A A1 INNER JOIN
TABLE B B1
ON B1.ID = A1.ID AND A1 = 'TASK';
This will run because I added a column name (SomeColumn):
SELECT *
FROM Table A A1 INNER JOIN
TABLE B B1
ON B1.ID = A1.ID AND A1.SomeColumn = 'TASK';
And is the same as this
SELECT *
FROM Table A A1 INNER JOIN
TABLE B B1
ON B1.ID = A1.ID
WHERE A1.SomeColumn = 'TASK';
Whenever you join to a constant in an inner join, it is pretty much the same as adding an additional criterion to the WHERE clause. The only reason to put it up with the join is for code clarity. (With an outer join the two placements can give different results.)

SELECT * -- Select all the columns
FROM TABLE A A1 -- From the table A. A1 is like a nickname you are giving table A. Instead of typing A.ColumnName (A could be a very long name) you just type A1.ColumnName
INNER JOIN TABLE B B1 -- You are inner joining tables A and B. Again, B1 is just a nickname. Here is a good picture explaining joins.
ON B1.ID = A1.ID -- This is the column that the 2 tables have in common (the relationship column). These need to contain the same data.
AND A1 = 'TASK' -- This compares the table nickname A1 itself with 'TASK'; a column is needed here, e.g. A1.SomeColumn = 'TASK'

Related

Left join inside left join

I have a problem getting values from tables.
I need something like this
A.Id a1
B.Id b1
C.Id c1
B.Id b2
C.Id c2
C.Id c3
C.Id c4
Tables A and B are joined together, and so are tables B and C.
A row in table A can have zero, one, or more values from table B. The same goes for table B and values from table C.
I need to perform a left join from table A to table B and, inside that, a left join from table B to table C.
I tried a left join between tables A and B, but I don't know how to perform a left join inside that left join.
Is that possible? What would the syntax for that look like?
edit:
Data would look like this
ZZN1 P1 NULL
ZZN1 P2 NAB1
ZZN2 P3 NAB2
ZZN2 P3 NAB3
No need to nest the left joins; you can simply flatten them and let your RDBMS handle the logic.
Sample schema:
a(id)
b(id, aid) -- aid is a foreign key to a(id)
c(id, bid) -- bid is a foreign key to b(id)
Query:
select a.id, b.id, c.id
from a
left join b on b.aid = a.id
left join c on c.bid = b.id
If the first left join does not find a match, then the second one cannot match either, since the joining column b.id will be null. On the other hand, if the first left join finds a match, then the second one may or may not, depending on whether the relevant bid is available in c.
SELECT A.Name, B.Name , C.Name
FROM A
LEFT JOIN B ON A.id = B.id
LEFT JOIN C ON B.id = C.id
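As a quick check of the flattened form, here is a sketch that loads the sample data from the question into the a(id), b(id, aid), c(id, bid) schema and runs the two chained left joins. It uses Python's sqlite3 module only as a test harness; the extra row ZZN3 (an a row with no b rows) is a hypothetical addition to show the all-NULL case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id TEXT PRIMARY KEY);
    CREATE TABLE b (id TEXT PRIMARY KEY, aid TEXT REFERENCES a(id));
    CREATE TABLE c (id TEXT PRIMARY KEY, bid TEXT REFERENCES b(id));
    -- ZZN3 is a made-up row with no children at all.
    INSERT INTO a VALUES ('ZZN1'), ('ZZN2'), ('ZZN3');
    INSERT INTO b VALUES ('P1', 'ZZN1'), ('P2', 'ZZN1'), ('P3', 'ZZN2');
    INSERT INTO c VALUES ('NAB1', 'P2'), ('NAB2', 'P3'), ('NAB3', 'P3');
""")

# Two flat left joins: a -> b, then b -> c.
rows = conn.execute("""
    SELECT a.id, b.id, c.id
    FROM a
    LEFT JOIN b ON b.aid = a.id
    LEFT JOIN c ON c.bid = b.id
    ORDER BY a.id, b.id, c.id
""").fetchall()

for row in rows:
    print(row)
# ('ZZN1', 'P1', None)
# ('ZZN1', 'P2', 'NAB1')
# ('ZZN2', 'P3', 'NAB2')
# ('ZZN2', 'P3', 'NAB3')
# ('ZZN3', None, None)
```

The first four rows match the data shown in the question's edit; the last row shows that an a row survives even when neither join finds a match.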

Choosing MAX values by the "where=" statement

Suppose I have table A with columns a1, a2 and table B with columns b1, b2.
I need to join them like this
proc sql;
create C as
select a1, b1
from A as t1
left join B( where=(b1=max(select b1 from B)) as t2
on t1.a2 = t2.b2
run;
The problem is in where=(a1=max(select a1 from A)). It doesn't work for some reason. I need a where= solution, because B is big and where= is really fast.
Your condition is on the first table. Hence, in a left join, such a condition usually goes in the where clause. Conditions on the second table would go in the on clause.
One method of doing what you want is to use a subquery:
proc sql;
create table C as
select a1, b1
from A t1 left join
B t2
on t1.a2 = t2.b2
where t1.a1 = (select max(tt1.a1) from A tt1);
quit;
It seems you only got the syntax wrong. This gets you the B record where b2 matches a2 and b1 is the maximum b1 value in the table.
create table c as
select a.a1, b.b1
from a
left join b on b.b2 = a.a2
and b.b1 = (select max(b1) from b);
Or are you simply trying to get the maximum b1 from all B records where b2 matches a2? That would be:
create table c as
select a.a1, max(b.b1)
from a
left join b on b.b2 = a.a2
group by a.a1;
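The two readings give different results, which is easy to see on a tiny data set. The sketch below uses Python's sqlite3 module rather than SAS (an assumption made only so the example is runnable anywhere), and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (a1 INTEGER, a2 TEXT);
    CREATE TABLE b (b1 INTEGER, b2 TEXT);
    INSERT INTO a VALUES (1, 'x'), (2, 'y');
    INSERT INTO b VALUES (10, 'x'), (20, 'x'), (5, 'y');
""")

# Reading 1: only the b row holding the overall maximum b1 may match.
overall_max = conn.execute("""
    SELECT a.a1, b.b1
    FROM a
    LEFT JOIN b ON b.b2 = a.a2
               AND b.b1 = (SELECT MAX(b1) FROM b)
    ORDER BY a.a1
""").fetchall()

# Reading 2: the maximum b1 among the b rows matching each a row.
per_key_max = conn.execute("""
    SELECT a.a1, MAX(b.b1)
    FROM a
    LEFT JOIN b ON b.b2 = a.a2
    GROUP BY a.a1
    ORDER BY a.a1
""").fetchall()

print(overall_max)  # [(1, 20), (2, None)]
print(per_key_max)  # [(1, 20), (2, 5)]
```

In the first query the row with a1 = 2 gets NULL because its only match (b1 = 5) is not the table-wide maximum; in the second it gets 5.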

Select full outer join from many-to-many relationships

I am trying to do something in MSSQL which I suppose is a fairly simple and common thing in any database with many-to-many relationships. However I seem to always end up with a quite complicated select query, I seem to be repeating the same conditions several times to get the desired output.
The scenario is like this. I have 2 tables (table A and B) and a cross table with foreign keys to the ID columns of A and B. There can only be one unique pair of As and Bs in the crosstable (I guess the 2 foreign keys make up a primary key in the cross table ?!?). Data in the three tables could look like this:
TABLE A        TABLE B        TABLE AB
ID  Type       ID  Type       AID  BID
----------------------------------------
R   Up         1   IN         R    3
S   DOWN       2   IN         T    3
T   UP         3   OUT        T    5
X   UP         4   OUT        Z    6
Y   DOWN       5   IN
Z   UP         6   OUT
Now let's say I select all rows in A of type UP and all rows in B of type OUT:
SELECT ID FROM A AS A1
WHERE Type = 'UP'
(Result: R, T, X, Z)
SELECT ID FROM B AS B1
WHERE Type = 'OUT'
(Result: 3, 4, 6)
What I want now is to fully outer join these 2 sub queries based on the relations listed in AB. Hence I want all IDs in A1 and B1 to be listed at least once:
A.ID   B.ID
R      3
T      3
null   4
X      null
Z      6
From this results set I want to be able to see:
- Which rows in A1 do not relate to any rows in B1
- Which rows in B1 do not relate to any rows in A1
- Relations between rows in A1 and B1
I have tried a couple of things such as:
SELECT A1.ID, B1.ID
FROM (
SELECT * FROM A
WHERE Type = 'UP') AS A1
FULL OUTER JOIN AB ON
A1.ID = AB.AID
FULL OUTER JOIN (
SELECT * FROM B
WHERE Type = 'OUT') AS B1
ON AB.BID = B1.ID
This doesn't work, since some of the relations listed in AB are between rows in A1 and rows NOT IN B1, or between rows in B1 and rows NOT IN A1.
In other words - I seem to be forced to create a subquery for the AB table also:
SELECT A1.ID, B1.ID
FROM (
SELECT * FROM A
WHERE Type = 'UP') AS A1
FULL OUTER JOIN (
SELECT * FROM AB AS AB1
WHERE
AID IN (SELECT ID FROM A WHERE type = 'UP') AND
BID IN (SELECT ID FROM B WHERE type = 'OUT')
) AS AB1 ON
A1.ID = AB1.AID
FULL OUTER JOIN (
SELECT * FROM B
WHERE Type = 'OUT') AS B1
ON AB1.BID = B1.ID
That just seems like a rather complicated solution for a seemingly simple problem, especially when you consider that A1 and B1 subqueries with more (complex) conditions, possibly involving joins to other tables (one-to-many), would require the same temporary joins and conditions to be repeated in the AB1 subquery.
I am thinking that there must be an obvious way to rewrite the above select statements in order to avoid having to repeat the same conditions several times. The solution is probably right there in front of me, but I just can't see it.
Any help would be appreciated.
I think you could employ a CTE in this case, like this:
;WITH cte AS (
SELECT A.ID AS AID, A.Type AS AType, B.ID AS BID, B.Type AS BType
FROM A FULL OUTER JOIN AB ON A.ID = AB.AID
FULL OUTER JOIN B ON B.ID = AB.BID)
SELECT AID, BID FROM CTE WHERE AType = 'UP' OR BType = 'OUT'
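The advantage of using a CTE is that the join is written only once; you can then add additional criteria to the WHERE clause outside the CTE. Note that SQL Server treats a CTE much like an inline view, so it is not necessarily evaluated only once.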
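For what it's worth, here is a rough sketch of one way to state each filter only once, run against the sample data from the question. It uses Python's sqlite3 module purely as a test harness (an assumption; the question is about MSSQL), and because older SQLite builds lack FULL OUTER JOIN it swaps in an emulation: a LEFT JOIN plus a NOT EXISTS anti-join glued together with UNION ALL. The CTE names a1, b1 and links are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id TEXT PRIMARY KEY, type TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE ab (aid TEXT, bid INTEGER, PRIMARY KEY (aid, bid));
    INSERT INTO a VALUES ('R','UP'),('S','DOWN'),('T','UP'),
                         ('X','UP'),('Y','DOWN'),('Z','UP');
    INSERT INTO b VALUES (1,'IN'),(2,'IN'),(3,'OUT'),
                         (4,'OUT'),(5,'IN'),(6,'OUT');
    INSERT INTO ab VALUES ('R',3),('T',3),('T',5),('Z',6);
""")

rows = conn.execute("""
    WITH a1 AS (SELECT id FROM a WHERE type = 'UP'),    -- filter on A, once
         b1 AS (SELECT id FROM b WHERE type = 'OUT'),   -- filter on B, once
         links AS (                                     -- relations inside both sets
             SELECT ab.aid, ab.bid
             FROM ab
             JOIN a1 ON a1.id = ab.aid
             JOIN b1 ON b1.id = ab.bid
         )
    -- Every filtered A row, with its related B row if any.
    SELECT a1.id AS aid, links.bid AS bid
    FROM a1
    LEFT JOIN links ON links.aid = a1.id
    UNION ALL
    -- Filtered B rows that no filtered A row relates to.
    SELECT NULL, b1.id
    FROM b1
    WHERE NOT EXISTS (SELECT 1 FROM links WHERE links.bid = b1.id)
""").fetchall()

print(rows)
```

On the sample data this yields the five pairs listed in the question: (R, 3), (T, 3), (X, null), (Z, 6) and (null, 4). The T to 5 relation is dropped because 5 is not of type OUT.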

SQL where clause for left outer join

I have a problem with a view I want to create. I have two tables joined in a left outer join, say tableA and tableB, where tableB is left outer joined.
I want to select only those rows from table B where state equals 4, so I add WHERE state = 4 to my query. Now the result set is trimmed quite a bit because all rows without a matching row in tableB are removed from the result (since state isn't 4 for those rows). I also tried WHERE state = 4 OR state IS NULL, doesn't work either (since state technically isn't NULL when there is no state).
So what I need is a WHERE condition which is only evaluated when there actually is a row. Does such a thing exist?
If not I see two options: join (SELECT * FROM tableB WHERE state = 4) instead of table B, or create a view with the same WHERE statement and join that instead. What's the best option performance wise?
This is SQL Server 2008 R2 by the way.
You put the conditions in the on clause. Example:
select a.this, b.that
from TableA a
left join TableB b on b.id = a.id and b.State = 4
You can add state = 4 to the join condition.
select *
from T1
left outer join T2
on T1.T1ID = T2.T1ID and
T2.state = 4
Even easier than a subquery is expanding the on clause, like this:
select *
from TableA a
left join
TableB b
on a.b_id = b.id
and b.state = 4
All rows from TableA will appear, and only those from TableB with state 4.
SQL Server will probably execute the view, the expanded on clause, and the subquery in exactly the same way. So performance-wise, there should be little difference.
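To see why the placement matters for a left join, here is a small sketch comparing the three variants discussed in this thread. It uses Python's sqlite3 module as a stand-in for SQL Server (an assumption made only for runnability), and the three sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (id INTEGER PRIMARY KEY);
    CREATE TABLE tableB (id INTEGER, state INTEGER);
    INSERT INTO tableA VALUES (1), (2), (3);
    -- id 1 has state 4, id 2 has a different state, id 3 has no row in tableB.
    INSERT INTO tableB VALUES (1, 4), (2, 5);
""")

def run(sql):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# Filter in WHERE: unmatched tableA rows are thrown away again.
where_filter = run("""
    SELECT a.id, b.state
    FROM tableA a
    LEFT JOIN tableB b ON b.id = a.id
    WHERE b.state = 4
    ORDER BY a.id
""")

# WHERE ... OR IS NULL: still loses id 2, whose matched row has state 5.
where_or_null = run("""
    SELECT a.id, b.state
    FROM tableA a
    LEFT JOIN tableB b ON b.id = a.id
    WHERE b.state = 4 OR b.state IS NULL
    ORDER BY a.id
""")

# Filter in ON: every tableA row is kept.
on_filter = run("""
    SELECT a.id, b.state
    FROM tableA a
    LEFT JOIN tableB b ON b.id = a.id AND b.state = 4
    ORDER BY a.id
""")

print(where_filter)   # [(1, 4)]
print(where_or_null)  # [(1, 4), (3, None)]
print(on_filter)      # [(1, 4), (2, None), (3, None)]
```

Only the on clause version returns all three tableA rows, with NULL wherever there is no tableB row in state 4.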
Alternative approach: (1) inner join to table B where state equals 4, (2) antijoin to table B to find rows that don't exist, (3) union the results:
SELECT A1.ID, A1.colA, B1.ColB
FROM tableA AS A1
INNER JOIN tableB AS B1
ON A1.ID = B1.ID
AND B1.state = 4
UNION
SELECT A1.ID, A1.colA, '{{MISSING}}' AS ColB
FROM tableA AS A1
WHERE NOT EXISTS (
SELECT *
FROM tableB AS B1
WHERE A1.ID = B1.ID
);
Alternatively:
SELECT A1.ID, A1.colA, B1.ColB
FROM tableA AS A1
JOIN tableB AS B1
ON A1.ID = B1.ID
AND B1.state = 4
UNION
SELECT ID, colA, '{{NA}}' AS ColB
FROM tableA
WHERE ID IN (
SELECT ID
FROM tableA
EXCEPT
SELECT ID
FROM tableB
);

filter duplicates in SQL join

When using a SQL join, is it possible to keep only rows that have a single row for the left table?
For example:
select * from A, B where A.id = B.a_id;
a1 b1
a2 b1
a2 b2
In this case, I want to remove all except the first row, where a single row from A matched exactly 1 row from B.
I'm using MySQL.
This should work in MySQL:
select * from A, B where A.id = B.a_id GROUP BY A.id HAVING COUNT(*) = 1;
For those of you not using MySQL, or using MySQL with ONLY_FULL_GROUP_BY enabled (the default since 5.7.5), you will need to use aggregate functions (like min() or max()) on all the columns (except A.id) so your database engine doesn't complain.
It helps if you specify the keys of your tables when asking a question such as this. It isn't obvious from your example what the key of B might be (assuming it has one).
Here's a possible solution assuming that ID is a candidate key of table B.
SELECT *
FROM A, B
WHERE B.id =
(SELECT MIN(B.id)
FROM B
WHERE A.id = B.a_id);
First, I would recommend using the JOIN syntax instead of the outdated syntax of separating tables by commas. Second, if A.id is the primary key of the table A, then you need only inspect table B for duplicates:
Select ...
From A
Join B
On B.a_id = A.id
Where Exists (
Select 1
From B B2
Where B2.a_id = A.id
Having Count(*) = 1
)
The NOT EXISTS variant below avoids the cost of counting matching rows, which can be expensive for large tables.
As usual, when comparing various possible solutions, benchmarking / comparing the execution plans is suggested.
select
*
from
A
join B on A.id = B.a_id
where
not exists (
select
1
from
B B2
where
A.id = b2.a_id
and b2.id != b.id
)
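Here is a small sketch that runs the counting approach and the NOT EXISTS approach side by side, to confirm they keep the same rows. It uses Python's sqlite3 module instead of MySQL (an assumption for runnability), the sample rows are invented, and B.id is assumed to be a key of B as one of the answers above suggests.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id TEXT PRIMARY KEY);
    CREATE TABLE B (id TEXT PRIMARY KEY, a_id TEXT REFERENCES A(id));
    -- a1 has exactly one match, a2 has two, a3 has none.
    INSERT INTO A VALUES ('a1'), ('a2'), ('a3');
    INSERT INTO B VALUES ('b1', 'a1'), ('b2', 'a2'), ('b3', 'a2');
""")

# Counting approach, with an aggregate on the B column so that it also
# works on engines that reject non-aggregated columns under GROUP BY.
by_count = conn.execute("""
    SELECT A.id, MIN(B.id)
    FROM A
    JOIN B ON B.a_id = A.id
    GROUP BY A.id
    HAVING COUNT(*) = 1
""").fetchall()

# NOT EXISTS approach: keep a pair only if A has no other matching B row.
by_not_exists = conn.execute("""
    SELECT A.id, B.id
    FROM A
    JOIN B ON B.a_id = A.id
    WHERE NOT EXISTS (
        SELECT 1
        FROM B AS B2
        WHERE B2.a_id = A.id
          AND B2.id <> B.id
    )
""").fetchall()

print(by_count)       # [('a1', 'b1')]
print(by_not_exists)  # [('a1', 'b1')]
```

Since the group contains exactly one row, MIN(B.id) is simply that row's id, so no information is lost by aggregating.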