SQL Server: new join syntax (ANSI vs. non-ANSI SQL JOIN syntax)

I have tried to convert old MS SQL join syntax to the new join syntax, but the number of rows in the results does not match.
Original SQL:
select
b.Amount
from
TableA a, TableB b,TableC c, TableD d
where
a.inv_no *= b.inv_no and
a.inv_item *= b.inv_item and
c.currency *= b.cash_ccy and
d.tx_code *= b.cash_receipt
Converted SQL:
SELECT
b.AMOUNT
FROM
(TableA AS a
LEFT OUTER JOIN
TableB AS b ON a.INV_NO = b.INV_NO
AND a.inv_item = b.inv_item
LEFT OUTER JOIN
TableC AS c ON c.currency = b.cash_ccy)
LEFT OUTER JOIN
TableD as d ON d.tx_code = b.cash_receipt
Findings
The results of the original and the modified SQL are the same up to the join of three tables, but when the fourth table (TableD) is joined in the modified SQL, the number of rows returned is different.

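The order of the columns within the predicates is important when using SQL Server's deprecated, proprietary outer join operators *= and =* (used with the old ANSI 89 comma-separated join syntax). The table on the side of the asterisk is the one whose rows are preserved.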
So while
SELECT *
FROM TableA AS A
LEFT JOIN TableB AS B
ON A.ColA = B.ColB;
Is exactly the same as
SELECT *
FROM TableA AS A
LEFT JOIN TableB AS B
ON B.ColB = A.ColA; -- NOTE ORDER HERE
The equivalent
SELECT *
FROM TableA AS A, TableB AS b
WHERE A.ColA *= B.ColB;
Is not the same as
SELECT *
FROM TableA AS A, TableB AS b
WHERE B.ColB *= A.ColA;
This last query's ANSI 92 equivalent would be
SELECT *
FROM TableA AS A
RIGHT JOIN TableB AS B
ON A.ColA = B.ColB;
Or if you dislike RIGHT JOIN as much as I do you would probably write:
SELECT *
FROM TableB AS B
LEFT OUTER JOIN TableA AS A
ON B.ColB = A.ColA;
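The symmetry of the ANSI ON predicate can be checked with a small script. This is only a sketch: it uses Python's built-in sqlite3 module instead of SQL Server (SQLite cannot run the *= operator), and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ColA INTEGER);
    CREATE TABLE TableB (ColB INTEGER);
    INSERT INTO TableA VALUES (1), (2);   -- hypothetical sample rows
    INSERT INTO TableB VALUES (2), (3);
""")

def rows(sql):
    # Sort by repr so results can be compared regardless of row order.
    return sorted(conn.execute(sql).fetchall(), key=repr)

# Same LEFT JOIN, predicate written both ways round: TableA is preserved.
a_first = rows("SELECT A.ColA, B.ColB FROM TableA AS A "
               "LEFT JOIN TableB AS B ON A.ColA = B.ColB")
a_swapped = rows("SELECT A.ColA, B.ColB FROM TableA AS A "
                 "LEFT JOIN TableB AS B ON B.ColB = A.ColA")

# Swapping the tables (not the predicate) is what changes the preserved side.
b_preserved = rows("SELECT A.ColA, B.ColB FROM TableB AS B "
                   "LEFT JOIN TableA AS A ON B.ColB = A.ColA")

print(a_first == a_swapped)  # True
print(a_first)
print(b_preserved)
```

With ANSI joins, the preserved table is decided by its position around the LEFT JOIN keyword, whereas with *= it is decided by which side of the operator the column sits on.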
So the equivalent query in ANSI 92 join syntax actually has to start with TableA, TableC and TableD, since columns from these tables are on the preserved (left) side of every *= predicate in the original WHERE clause. Then, since there is no direct link between the three, you end up with a cross join:
SELECT b.Amount
FROM TableA AS a
CROSS JOIN TableD AS d
CROSS JOIN TableC AS c
LEFT JOIN TableB AS B
ON c.currency = b.cash_ccy
AND d.tx_code = b.cash_receipt
AND a.INV_NO = b.INV_NO
AND a.inv_item = b.inv_item;
This is the equivalent rewrite, and it explains the difference in the number of rows.
WORKING EXAMPLE
Needs to be run on SQL Server 2008 or earlier with compatibility level 80 or less
-- SAMPLE DATA --
CREATE TABLE #TableA (Inv_No INT, Inv_item INT);
CREATE TABLE #TableB (Inv_No INT, Inv_item INT, cash_ccy INT, cash_receipt INT, Amount INT);
CREATE TABLE #TableC (currency INT);
CREATE TABLE #TableD (tx_code INT);
INSERT #TableA (inv_no, inv_item) VALUES (1, 1), (2, 2);
INSERT #TableB (inv_no, inv_item, cash_ccy, cash_receipt, Amount) VALUES (1, 1, 1, 1, 1), (2, 2, 2, 2, 2);
INSERT #TableC (currency) VALUES (1), (2), (3), (4);
INSERT #TableD (tx_code) VALUES (1), (2), (3), (4);
-- ORIGINAL QUERY(32 ROWS)
SELECT
b.Amount
FROM
#TableA a, #TableB b,#TableC c, #TableD d
WHERE
a.inv_no *= b.inv_no and
a.inv_item *= b.inv_item and
c.currency *= b.cash_ccy and
d.tx_code *= b.cash_receipt
-- INCORRECT ANSI 92 REWRITE (2 ROWS)
SELECT b.AMOUNT
FROM #TableA AS a
LEFT OUTER JOIN #TableB AS b
ON a.INV_NO = b.INV_NO
and a.inv_item = b.inv_item
LEFT OUTER JOIN #TableC AS c
ON c.currency = b.cash_ccy
LEFT OUTER JOIN #TableD as d
ON d.tx_code = b.cash_receipt;
-- CORRECT ANSI 92 REWRITE (32 ROWS)
SELECT b.Amount
FROM #TableA AS a
CROSS JOIN #TableD AS d
CROSS JOIN #TableC AS c
LEFT JOIN #TableB AS B
ON c.currency = b.cash_ccy
AND d.tx_code = b.cash_receipt
AND a.INV_NO = b.INV_NO
AND a.inv_item = b.inv_item;
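If no SQL Server instance with compatibility level 80 is at hand, the two ANSI 92 rewrites can still be compared elsewhere. The following is a rough sketch using Python's built-in sqlite3 module with the same sample data (plain tables instead of temp tables); the *= query itself cannot be run this way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (inv_no INT, inv_item INT);
    CREATE TABLE TableB (inv_no INT, inv_item INT, cash_ccy INT,
                         cash_receipt INT, amount INT);
    CREATE TABLE TableC (currency INT);
    CREATE TABLE TableD (tx_code INT);
    INSERT INTO TableA VALUES (1, 1), (2, 2);
    INSERT INTO TableB VALUES (1, 1, 1, 1, 1), (2, 2, 2, 2, 2);
    INSERT INTO TableC VALUES (1), (2), (3), (4);
    INSERT INTO TableD VALUES (1), (2), (3), (4);
""")

# Chained LEFT JOINs: TableB is joined first, so C and D hang off B.
chained = conn.execute("""
    SELECT b.amount
    FROM TableA AS a
    LEFT JOIN TableB AS b ON a.inv_no = b.inv_no AND a.inv_item = b.inv_item
    LEFT JOIN TableC AS c ON c.currency = b.cash_ccy
    LEFT JOIN TableD AS d ON d.tx_code = b.cash_receipt
""").fetchall()

# A, C and D are all preserved, and B is outer joined to their cross product.
crossed = conn.execute("""
    SELECT b.amount
    FROM TableA AS a
    CROSS JOIN TableD AS d
    CROSS JOIN TableC AS c
    LEFT JOIN TableB AS b
           ON c.currency = b.cash_ccy
          AND d.tx_code = b.cash_receipt
          AND a.inv_no = b.inv_no
          AND a.inv_item = b.inv_item
""").fetchall()

print(len(chained), len(crossed))  # 2 32
```

Of the 32 rows in the second result, only the two combinations that match a TableB row on all four columns carry a non-NULL amount.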

Related

Linking multiple tables with an Oracle outer join

Struggling with Oracle outer join syntax.
We have this query with inner and outer joins;
SELECT A.aa, B.bb, C.cc, D.dd
FROM
TABLEA A, TABLEB B, TABLEC C, TABLED D
WHERE
A.XX = B.XX AND
B.YY = C.YY AND
C.ZZ = D.WW (+)
The query works fine. A change now means that the link between tables A and B (on XX) may not be present.
So we'd like to turn this into an outer join which returns data regardless of whether the existing joins are satisfied OR if there is no link between A and B (and the other tables).
How can you do this?
Say you have your tables like the following:
insert into tableA values (1);
insert into tableA values (2);
insert into tableB values ( 1, 10);
insert into tableB values ( -2, 20);
insert into tableC values ( 10, 100);
insert into tableC values ( 20, 200);
insert into tableD values ( 200);
insert into tableD values ( 999);
If I understand correctly, you need to use an outer join on B and C as well, not only on D; in the old Oracle syntax this is:
SELECT *
FROM
TABLEA A, TABLEB B, TABLEC C, TABLED D
WHERE
A.XX = B.XX(+) AND
B.YY = C.YY(+) AND
C.ZZ = D.WW (+)
And in (better) ANSI SQL:
select *
from tableA A
left outer join
tableB B on ( A.xx = B.xx)
left outer join
tableC C on ( B.yy = C.yy)
left outer join
tableD D on ( C.zz = D.ww)
They both give:
        XX         XX         YY         YY         ZZ         WW
---------- ---------- ---------- ---------- ---------- ----------
         2
         1          1         10         10        100
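The ANSI version can be tried outside Oracle as well. Below is a minimal sketch using Python's sqlite3 module; the column layout (tableA.xx, tableB.xx/yy, tableC.yy/zz, tableD.ww) is an assumption inferred from the queries and the output header, since the CREATE TABLE statements are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed table definitions (not given in the original post).
    CREATE TABLE tableA (xx INTEGER);
    CREATE TABLE tableB (xx INTEGER, yy INTEGER);
    CREATE TABLE tableC (yy INTEGER, zz INTEGER);
    CREATE TABLE tableD (ww INTEGER);
    INSERT INTO tableA VALUES (1), (2);
    INSERT INTO tableB VALUES (1, 10), (-2, 20);
    INSERT INTO tableC VALUES (10, 100), (20, 200);
    INSERT INTO tableD VALUES (200), (999);
""")

# Original shape: inner joins A-B and B-C, outer join only to D.
inner_then_outer = conn.execute("""
    SELECT A.xx, B.xx, B.yy, C.yy, C.zz, D.ww
    FROM tableA A
    JOIN tableB B ON A.xx = B.xx
    JOIN tableC C ON B.yy = C.yy
    LEFT OUTER JOIN tableD D ON C.zz = D.ww
    ORDER BY A.xx
""").fetchall()

# Outer joins all the way down: every row of tableA is kept.
all_outer = conn.execute("""
    SELECT A.xx, B.xx, B.yy, C.yy, C.zz, D.ww
    FROM tableA A
    LEFT OUTER JOIN tableB B ON A.xx = B.xx
    LEFT OUTER JOIN tableC C ON B.yy = C.yy
    LEFT OUTER JOIN tableD D ON C.zz = D.ww
    ORDER BY A.xx
""").fetchall()

print(inner_then_outer)  # only the row for xx = 1
print(all_outer)         # rows for xx = 1 and xx = 2
```

The row for xx = 2 only appears once the joins to B and C are outer joins too, because an inner join anywhere in the chain discards it.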

SQL join on three tables, lines that exist in 2 tables but not the third

Please, I need your help.
Suppose that we have 3 tables A, B and C, as shown in the image below:
I want to get the rows of table A whether or not they exist in table B, and the rows of table C whether or not they exist in table B, using a single SQL query.
I have tried this, but it doesn't work:
SELECT A.ATS0804, C.ATS0207, A.ATS0959, A.ATS0802, B.ATS0827
FROM
ISUT183.ENS0042 B
RIGHT JOIN ISUT183.ENS0038 A
ON B.ENS0038K = A.ATS0804
RIGHT JOIN ISUT183.EN00041 C
ON B.EN00041K = C.AT02812
WHERE ( C.ATS0207 = '0001757430'
AND B.ATS0823 = '9999-01-01'
AND A.ATS0803 = '9999-01-01'
AND A.ATS0959 = '61384352001'
AND A.ATS0802 ='01.01.2010'
) ;
You can do a cross join too:
with AB as (
select * from A left outer join B on A.ID1=B.ID1
),
AC as (
select * from C left outer join B on C.ID2=B.ID2
)
select * from AB CROSS JOIN AC
Use WHERE EXISTS and WHERE NOT EXISTS clauses.
If you test a column of table B for equality in the WHERE clause, the rows for which the left or right outer join filled B with NULLs are filtered out.
You don't have a join between A and C, so you can use a UNION ALL,
but the SELECT lists must have columns of the same type (ID1 must have the same type as ID2):
select * from (
select 'A-B' typejoin, A.ID1 as IDA_OR_C, B.ID1 as IDB from A left outer join B on A.ID1=B.ID1
union all
select 'A-C' typejoin, C.ID2 as IDA_OR_C, B.ID2 as IDB from C left outer join B on C.ID2=B.ID2
) tmp
where ....
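The point about testing B in the WHERE clause can be illustrated with a small experiment. This is a sketch only, run against SQLite through Python's sqlite3 module; the tables A and B and the status column are hypothetical stand-ins for the real ISUT183 tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical miniature tables, not the asker's real schema.
    CREATE TABLE A (ID1 INTEGER);
    CREATE TABLE B (ID1 INTEGER, status TEXT);
    INSERT INTO A VALUES (1), (2), (3);
    INSERT INTO B VALUES (1, 'open'), (2, 'closed');
""")

# Filter on B in the WHERE clause: rows where B is NULL (or does not
# match) are removed, so the outer join behaves like an inner join.
where_filtered = conn.execute("""
    SELECT A.ID1, B.status
    FROM A
    LEFT OUTER JOIN B ON A.ID1 = B.ID1
    WHERE B.status = 'open'
    ORDER BY A.ID1
""").fetchall()

# Same filter moved into the ON clause: every row of A survives.
on_filtered = conn.execute("""
    SELECT A.ID1, B.status
    FROM A
    LEFT OUTER JOIN B ON A.ID1 = B.ID1 AND B.status = 'open'
    ORDER BY A.ID1
""").fetchall()

print(where_filtered)  # [(1, 'open')]
print(on_filtered)     # [(1, 'open'), (2, None), (3, None)]
```

In the question's query, the condition on B.ATS0823 sits in the WHERE clause, which is one likely reason why the unmatched rows disappear.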

Differences between forms of LEFT JOIN

What is the difference between this query
SELECT
A.AnyField
FROM
A
LEFT JOIN B ON B.AId = A.Id
LEFT JOIN C ON B.CId = C.Id
and this other query
SELECT
A.AnyField
FROM
A
LEFT JOIN
(
B
JOIN C ON B.CId = C.Id
) ON B.AId = A.Id
Original answer:
They are not the same.
For example a left join b left join c will return a rows, plus b rows even if there are no c rows.
a left join (b join c) will never return b rows if there are no c rows.
Added later:
SQL>create table a (id int);
SQL>create table b (id int);
SQL>create table c (id int);
SQL>insert into a values (1);
SQL>insert into a values (2);
SQL>insert into b values (1);
SQL>insert into b values (1);
SQL>insert into c values (2);
SQL>insert into c values (2);
SQL>insert into c values (2);
SQL>insert into c values (2);
SQL>select a.id from a left join b on a.id = b.id left join c on b.id = c.id;
id
===========
1
1
2
3 rows found
SQL>select a.id from a left join (b join c on b.id = c.id) on a.id = b.id;
id
===========
1
2
2 rows found
The first query is going to take ALL records from table a and then only records from table b where a.id is equal to b.id. Then it's going to take all records from table c where the resulting records in table b have a cid that matches c.id.
The second query is going to first JOIN b and c on the id. That is, records will only make it to the resultset from that join where the b.CId and the c.ID are the same, because it's an INNER JOIN.
Then the result of the b INNER JOIN c will be LEFT JOINed to table a. That is, the DB will take all records from a and only the records from the results of b INNER JOIN c where a.id is equal to b.id
The difference is that you may end up with more data from b in your first query since the DB isn't dropping records from your result set just because b.cid <> c.id.
For a visual, the following Venn diagram shows which records are available
SELECT
A.AnyField
FROM
A
LEFT JOIN B ON B.AId = A.Id
LEFT JOIN C ON B.CId = C.Id
In this query you are LEFT JOINing C with B which will give you all records possible with B whether or not there is a match to any records in C.
SELECT
A.AnyField
FROM
A
LEFT JOIN
(
B
JOIN C ON B.CId = C.Id
) ON B.AId = A.Id
In this query you are INNER JOINing C with B which will result in matching B and C records.
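Both queries return every record of A at least once, and because you are only pulling columns from A you will not see which records had matches in B and C. The row counts can still differ, though: the first query repeats an A record for every matching B record, even when that B record has no match in C.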
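Selecting columns from B and C as well makes the difference between the two forms visible. The sketch below uses Python's sqlite3 module with made-up rows for the question's schema (A.Id, B.AId, B.CId, C.Id); the parenthesized join is written as a derived table named BC, which is an assumption made for portability rather than the question's exact syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data for the question's schema.
    CREATE TABLE A (Id INTEGER, AnyField TEXT);
    CREATE TABLE B (Id INTEGER, AId INTEGER, CId INTEGER);
    CREATE TABLE C (Id INTEGER);
    INSERT INTO A VALUES (1, 'a1'), (2, 'a2');
    INSERT INTO B VALUES (10, 1, 100), (11, 1, 999);  -- CId 999 has no C row
    INSERT INTO C VALUES (100);
""")

# A LEFT JOIN B LEFT JOIN C: B rows are kept even without a C match.
chained = conn.execute("""
    SELECT A.AnyField, B.Id, C.Id
    FROM A
    LEFT JOIN B ON B.AId = A.Id
    LEFT JOIN C ON B.CId = C.Id
    ORDER BY A.Id, B.Id
""").fetchall()

# A LEFT JOIN (B JOIN C): B rows without a C match are dropped first.
nested = conn.execute("""
    SELECT A.AnyField, BC.BId, BC.CId
    FROM A
    LEFT JOIN (SELECT B.Id AS BId, B.AId AS AId, C.Id AS CId
               FROM B
               JOIN C ON B.CId = C.Id) AS BC
           ON BC.AId = A.Id
    ORDER BY A.Id, BC.BId
""").fetchall()

print(chained)  # [('a1', 10, 100), ('a1', 11, None), ('a2', None, None)]
print(nested)   # [('a1', 10, 100), ('a2', None, None)]
```

Even if only A.AnyField were selected, the first form would return 'a1' twice and the second only once.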

How do JOINs distribute over OUTER JOINs?

Are there any case where these two are not equivalent?
A OUTER JOIN (B JOIN C)
A OUTER JOIN C OUTER JOIN B
Yes:
In your first example (A OUTER JOIN (B JOIN C)), if either B or C does not have a matching record, both B and C are omitted.
In your second example (A OUTER JOIN C OUTER JOIN B), C can be returned even if B does not have a matching record.
Still same as my comment. If B join C produces an empty result set, then A outer join "empty result set" is the same as just A.
A outer join B outer join C is something different (at least if one of B and C is not empty).
Since A seems to connect to C, it would be clearer to write the first option as:
A OUTER JOIN (C JOIN B)
The key difference between your two options is whether data from C is returned when there is a match between A and C. Just looking at this aspect the two options could be viewed as the sets:
intersect(A,intersect(C,B))
intersect(A,C)
Clearly, the two are different since the first form can eliminate rows from C before it is intersected with A.
What fields you join on can make a difference, as tables are not guaranteed to have only one possible field to join to another table on. In this case, does table b have a field to join to either table c or table a? That makes a difference. What fields you want returned makes a difference to the result set and to whether two queries will return the same results. The state of the data makes a difference, as some queries will appear to be equivalent until the data changes, so understanding that these are not equivalent queries helps you avoid these mistakes. Whether you use a full, left or right outer join makes a difference as well. And finally, what WHERE clauses you add can make a difference in whether they appear to be equivalent.
Check out these examples using temp tables (SQL Server syntax):
create table #a (aid int, sometext varchar(50))
create table #b (bid int, sometext2 varchar(50), cid int, aid int)
create table #c (cid int, sometext3 varchar(50), aid int)
insert into #a
values(1, 'test') , (2, 'test2'), (3, 'test3')
insert into #b
values(1, 'test', 1, 2) , (2, 'test2', 2, 1), (3, 'test3', 2, 2)
insert into #c
values(1, 'test', 1) , (2, 'test2', 2), (3, 'test3', 1)
select *
from #a a
left outer join #c c on a.aid = c.aid
left outer join #b b on a.aid = b.aid
select *
from #a a
left outer join #c c on a.aid = c.aid
left outer join #b b on c.cid = b.cid
select *
from #a a
left outer join #b b
join #c c on b.cid = c.cid
on a.aid = b.aid
select *
from #a a
right outer join #c c on a.aid = c.aid
right outer join #b b on a.aid = b.aid
select *
from #a a
right outer join #c c on a.aid = c.aid
right outer join #b b on c.cid = b.cid
select *
from #a a
right outer join #b b
join #c c on b.cid = c.cid
on a.aid = b.aid
select *
from #a a
full outer join #c c on a.aid = c.aid
full outer join #b b on a.aid = b.aid
select *
from #a a
full outer join #c c on a.aid = c.aid
full outer join #b b on c.cid = b.cid
select *
from #a a
full outer join #b b
join #c c on b.cid = c.cid
on a.aid = b.aid
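The left outer join variants above can also be run without SQL Server to compare their results. The following is a sketch with Python's sqlite3 module and the same sample rows; the nested "b join c" form is rewritten as a derived table (alias bc, a name introduced here), and the right and full outer variants are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (aid INT, sometext TEXT);
    CREATE TABLE b (bid INT, sometext2 TEXT, cid INT, aid INT);
    CREATE TABLE c (cid INT, sometext3 TEXT, aid INT);
    INSERT INTO a VALUES (1, 'test'), (2, 'test2'), (3, 'test3');
    INSERT INTO b VALUES (1, 'test', 1, 2), (2, 'test2', 2, 1), (3, 'test3', 2, 2);
    INSERT INTO c VALUES (1, 'test', 1), (2, 'test2', 2), (3, 'test3', 1);
""")

# b joined to a on aid.
b_via_a = conn.execute("""
    SELECT a.aid, c.cid, b.bid
    FROM a
    LEFT OUTER JOIN c ON a.aid = c.aid
    LEFT OUTER JOIN b ON a.aid = b.aid
""").fetchall()

# b joined to c on cid instead.
b_via_c = conn.execute("""
    SELECT a.aid, c.cid, b.bid
    FROM a
    LEFT OUTER JOIN c ON a.aid = c.aid
    LEFT OUTER JOIN b ON c.cid = b.cid
""").fetchall()

# a LEFT OUTER JOIN (b JOIN c), written as a derived table.
nested = conn.execute("""
    SELECT a.aid, bc.cid, bc.bid
    FROM a
    LEFT OUTER JOIN (SELECT b.bid, b.aid AS b_aid, c.cid
                     FROM b
                     JOIN c ON b.cid = c.cid) AS bc
                 ON a.aid = bc.b_aid
""").fetchall()

print(len(b_via_a), len(b_via_c), len(nested))  # 5 5 4
print(set(b_via_a) == set(b_via_c))             # False
```

The first two queries happen to return the same number of rows with this data, yet the rows themselves differ, which is the kind of apparent equivalence the answer warns about.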

Order of Full Outer Joins Yields Different Number of Result Rows..Why?

I'm working with two tables A and B. Table A identifies equity securities and Table B has a number of details on the security.
For example, when B.Item = 5301, the row specifies price for a given security. When B.Item = 9999, the row specifies dividends for a given security. I am trying to get both price and dividends in the same row. In order to achieve this, I FULL JOINed table B twice to table A.
SELECT *
FROM a a
FULL JOIN (SELECT *
FROM b) b
ON b.code = a.code
AND b.item = 3501
FULL JOIN (SELECT *
FROM b) b2
ON b2.code = a.code
AND b.item = 9999
AND b2.year_ = b.year_
AND b.freq = b2.freq
AND b2.seq = b.seq
WHERE a.code IN ( 122514 )
The remaining fields in the join clause like Year_, Freq, and Seq just make sure the dates of the price and dividends match. A.Code simply identifies a single security.
My issue is that when I flip the order of the full joins I get a different number of results. So if b.Item = 9999 comes before b.Item 2501, I get one result. The other way around I get 2 results. I realized the table B has zero entries for security 122514 for dividend, but has two entries for price.
When price is specified first, I get both prices and dividend fields are null. However, when dividend is specified first, I get NULLs for the dividend fields and also nulls for the prices fields.
Why aren't the two price entries showing up? I would expect them to do so in a FULL JOIN
It's because your second FULL OUTER JOIN refers to your first FULL OUTER JOIN. This means changing the order of them is making a fundamental change to the query.
Here is some sample T-SQL, using table variables, that demonstrates how this works:
DECLARE @a TABLE (Id INT, Name VARCHAR(50));
INSERT INTO @a VALUES (1, 'Dog Trades');
INSERT INTO @a VALUES (2, 'Cat Trades');
DECLARE @b TABLE (Id INT, ItemCode VARCHAR(1), PriceDate DATE, Price INT, DividendDate DATE, Dividend INT);
INSERT INTO @b VALUES (1, 'p', '20141001', 100, '20140101', 1000);
INSERT INTO @b VALUES (1, 'p', '20141002', 50, NULL, NULL);
INSERT INTO @b VALUES (2, 'c', '20141001', 10, '20141001', 500);
INSERT INTO @b VALUES (2, 'c', NULL, NULL, '20141002', 300);
--Same results: neither join condition refers to the other joined alias
SELECT a.*, b1.*, b2.* FROM @a a
FULL OUTER JOIN @b b1 ON b1.Id = a.Id AND b1.ItemCode = 'p'
FULL OUTER JOIN @b b2 ON b2.Id = a.Id AND b2.ItemCode = 'c';
SELECT a.*, b2.*, b1.* FROM @a a
FULL OUTER JOIN @b b1 ON b1.Id = a.Id AND b1.ItemCode = 'c'
FULL OUTER JOIN @b b2 ON b2.Id = a.Id AND b2.ItemCode = 'p';
--Different results: the second join condition refers to b1
SELECT a.*, b1.*, b2.* FROM @a a
FULL OUTER JOIN @b b1 ON b1.Id = a.Id AND b1.ItemCode = 'p'
FULL OUTER JOIN @b b2 ON b2.Id = a.Id AND b2.ItemCode = 'c' AND b2.DividendDate = b1.PriceDate;
SELECT a.*, b2.*, b1.* FROM @a a
FULL OUTER JOIN @b b1 ON b1.Id = a.Id AND b1.ItemCode = 'c'
FULL OUTER JOIN @b b2 ON b2.Id = a.Id AND b2.ItemCode = 'p' AND b2.DividendDate = b1.PriceDate;
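The asker's own situation (two price rows, no dividend rows) can be approximated in a few lines. This is a simplified sketch, not a reproduction: it uses Python's sqlite3 module and LEFT JOIN in place of FULL JOIN, since FULL JOIN is only available in recent SQLite versions, and the table layout and values are invented. With a real FULL JOIN, the unmatched price rows would come back with a NULL a.code, and the WHERE filter on a.code would presumably remove them again.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical miniature of the asker's tables.
    CREATE TABLE a (code INTEGER);
    CREATE TABLE b (code INTEGER, item INTEGER, year_ INTEGER, value INTEGER);
    INSERT INTO a VALUES (122514);
    -- Two price rows (item 3501) and no dividend rows (item 9999).
    INSERT INTO b VALUES (122514, 3501, 2013, 10), (122514, 3501, 2014, 11);
""")

# Price joined first: the dividend join depends on p, which has rows.
price_first = conn.execute("""
    SELECT a.code, p.value, d.value
    FROM a
    LEFT JOIN b AS p ON p.code = a.code AND p.item = 3501
    LEFT JOIN b AS d ON d.code = a.code AND d.item = 9999
                    AND d.year_ = p.year_
    ORDER BY p.year_
""").fetchall()

# Dividend joined first: d is all NULL, so p.year_ = d.year_ never holds.
dividend_first = conn.execute("""
    SELECT a.code, p.value, d.value
    FROM a
    LEFT JOIN b AS d ON d.code = a.code AND d.item = 9999
    LEFT JOIN b AS p ON p.code = a.code AND p.item = 3501
                    AND p.year_ = d.year_
""").fetchall()

print(price_first)     # two rows, dividend column NULL
print(dividend_first)  # one row, price and dividend both NULL
```

Because a comparison with NULL is never true, the join that comes second can only match rows when the join that comes first actually found something.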