I have the following SQL statement in a legacy system I'm refactoring. It is an abbreviated view for the purposes of this question, just returning count(*) for the time being.
SELECT COUNT(*)
FROM Table1
INNER JOIN Table2
INNER JOIN Table3 ON Table2.Key = Table3.Key AND Table2.Key2 = Table3.Key2
ON Table1.DifferentKey = Table3.DifferentKey
It is generating a very large number of records and killing the system, but could someone please explain the syntax? And can this be expressed in any other way?
Table1 contains 419 rows
Table2 contains 3374 rows
Table3 contains 28182 rows
EDIT:
Suggested reformat
SELECT COUNT(*)
FROM Table1
INNER JOIN Table3
ON Table1.DifferentKey = Table3.DifferentKey
INNER JOIN Table2
ON Table2.Key = Table3.Key AND Table2.Key2 = Table3.Key2
For readability, I restructured the query, starting with the apparent top-most level, Table1, which ties to Table3, which in turn ties to Table2. It is much easier to follow if you follow the chain of relationships.
Now, to answer your question. You are getting a large count as the result of a Cartesian product. If X records in Table1 share a DifferentKey value with Y records in Table3, that join alone produces X * Y rows. The join between Table3 and Table2 then has the same impact: each of those rows is repeated for each of the Z matching records in Table2. So the result for just one possible ID in Table1 can have X * Y * Z records.
This is based on not knowing how your tables are normalized or what they contain, e.g. whether the key is a PRIMARY key or not.
Ex:
Table 1
DiffKey Other Val
1 X
1 Y
1 Z
Table 3
DiffKey Key Key2 Tbl3 Other
1 2 6 V
1 2 6 X
1 2 6 Y
1 2 6 Z
Table 2
Key Key2 Other Val
2 6 a
2 6 b
2 6 c
2 6 d
2 6 e
So, joining Table 1 to Table 3 results (in this scenario) in 12 records (each row in 1 joined with each row in 3). Each of those is then repeated for every matching record in Table 2 (5 records), so a total count of 60 (3 tbl1 * 4 tbl3 * 5 tbl2) would be returned.
So now take that and scale it up to your thousands of records, and you see how a messed-up structure could choke a cow (so to speak) and kill performance.
SELECT
COUNT(*)
FROM
Table1
INNER JOIN Table3
ON Table1.DifferentKey = Table3.DifferentKey
INNER JOIN Table2
ON Table3.Key = Table2.Key
AND Table3.Key2 = Table2.Key2
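The 3 * 4 * 5 arithmetic from the sample tables can be checked with a small sketch. This uses Python's built-in sqlite3 module purely as a convenient test bed; the column names Key1, Key2 and OtherVal are stand-ins I picked for the example, not names from the original schema.

```python
import sqlite3

# In-memory database holding the three sample tables from the example.
# Key1/Key2/OtherVal are illustrative column names (assumptions).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (DifferentKey INTEGER, OtherVal TEXT);
    CREATE TABLE Table3 (DifferentKey INTEGER, Key1 INTEGER, Key2 INTEGER, OtherVal TEXT);
    CREATE TABLE Table2 (Key1 INTEGER, Key2 INTEGER, OtherVal TEXT);
    INSERT INTO Table1 VALUES (1, 'X'), (1, 'Y'), (1, 'Z');
    INSERT INTO Table3 VALUES (1, 2, 6, 'V'), (1, 2, 6, 'X'), (1, 2, 6, 'Y'), (1, 2, 6, 'Z');
    INSERT INTO Table2 VALUES (2, 6, 'a'), (2, 6, 'b'), (2, 6, 'c'), (2, 6, 'd'), (2, 6, 'e');
""")

def count(sql):
    """Run a COUNT(*) query and return the single number."""
    return conn.execute(sql).fetchone()[0]

# Table1 joined to Table3 only: 3 rows * 4 rows on the same DifferentKey.
rows_1_to_3 = count("""
    SELECT COUNT(*)
    FROM Table1
    INNER JOIN Table3 ON Table1.DifferentKey = Table3.DifferentKey
""")

# Adding Table2: each of those rows is repeated for the 5 matching Table2 rows.
rows_all = count("""
    SELECT COUNT(*)
    FROM Table1
    INNER JOIN Table3 ON Table1.DifferentKey = Table3.DifferentKey
    INNER JOIN Table2 ON Table3.Key1 = Table2.Key1 AND Table3.Key2 = Table2.Key2
""")

print(rows_1_to_3, rows_all)  # 12 60
```

If the real keys are not unique on at least one side of each join, the same multiplication happens on the full tables.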
Since you've already received help on the query, I'll take a poke at your syntax question:
The first query employs some lesser-known ANSI SQL syntax which allows you to nest joins between the join and on clauses. This allows you to scope/tier your joins and probably opens up a host of other evil, arcane things.
Now, while a nested join cannot refer any higher in the join hierarchy than its immediate parent, joins above it or outside of its branch can refer to it... which is precisely what this ugly little guy is doing:
select
    count(*)
from Table1 as t1
join Table2 as t2
    join Table3 as t3
    on t2.Key = t3.Key                  -- join #1
    and t2.Key2 = t3.Key2
on t1.DifferentKey = t3.DifferentKey    -- join #2
This looks a little confusing because join #2 is joining t1 to t2 without specifically referencing t2. However, it references t2 indirectly via t3, as t3 is joined to t2 in join #1. While that may work, you may find the following a bit more (visually) linear and appealing:
select
    count(*)
from Table1 as t1
join Table3 as t3
    join Table2 as t2
    on t2.Key = t3.Key                  -- join #1
    and t2.Key2 = t3.Key2
on t1.DifferentKey = t3.DifferentKey    -- join #2
Personally, I've found that nesting in this fashion keeps my statements tidy by outlining each tier of the relationship hierarchy. As a side note, you don't need to specify inner: a plain join is implicitly inner unless explicitly marked otherwise.
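To see that the nested form and the flat form describe the same join, here is a rough sketch using Python's sqlite3 module. Two assumptions to flag: I wrote the nesting with explicit parentheses, because SQLite (used here only as a test bed) is assumed not to parse the bare join ... join ... on ... on form, and the column names Key1 and Key2 are stand-ins of mine.

```python
import sqlite3

# Small sample data; Key1/Key2 are illustrative column names (assumptions).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (DifferentKey INTEGER);
    CREATE TABLE Table2 (Key1 INTEGER, Key2 INTEGER);
    CREATE TABLE Table3 (DifferentKey INTEGER, Key1 INTEGER, Key2 INTEGER);
    INSERT INTO Table1 VALUES (1), (1), (1);
    INSERT INTO Table2 VALUES (2, 6), (2, 6), (2, 6), (2, 6), (2, 6);
    INSERT INTO Table3 VALUES (1, 2, 6), (1, 2, 6), (1, 2, 6), (1, 2, 6);
""")

# Nested form: Table2 and Table3 are joined first (join #1), and the
# result of that join is then joined to Table1 (join #2).
nested = conn.execute("""
    SELECT COUNT(*)
    FROM Table1
    JOIN (Table2
          JOIN Table3
            ON Table2.Key1 = Table3.Key1
           AND Table2.Key2 = Table3.Key2)
      ON Table1.DifferentKey = Table3.DifferentKey
""").fetchone()[0]

# Flat form: the same two join conditions, written as a chain.
flat = conn.execute("""
    SELECT COUNT(*)
    FROM Table1
    JOIN Table3 ON Table1.DifferentKey = Table3.DifferentKey
    JOIN Table2 ON Table2.Key1 = Table3.Key1
               AND Table2.Key2 = Table3.Key2
""").fetchone()[0]

print(nested, flat)  # 60 60
```

For inner joins the two spellings give the same rows; the nesting only starts to matter semantically once outer joins are involved.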
Related
Is there a way to merge/combine the results from multiple joined tables into a single field?
Two key details:
Must not use a Union.
Must be able to generate a View with the code.
We have a nested table structure and there's quite a bit of conditional logic, so I'd like to avoid writing the same large query multiple times (hence why I'm trying to avoid a Union).
select [Combine Id results from all 4 tables]
from Table1 tbl1
inner join Table2 tbl2 on tbl1.Id = tbl2.ParentId
inner join Table3 tbl3 on tbl2.Id = tbl3.ParentId
inner join Table4 tbl4 on tbl3.Id = tbl4.ParentId
So if the tables contain the following data:
Table 1
Id, ParentId
1, 1
Table 2
Id, ParentId
2, 1
Table 3
Id, ParentId
3, 2
Table 4
Id, ParentId
4, 3
Is it possible to produce the following single-field output (a list of integers) using the join structure from my original query?
Id
1
2
3
4
I think apply does what you want:
select v.id
from Table1 tbl1 inner join
Table2 tbl2
on tbl1.Id = tbl2.ParentId inner join
Table3 tbl3
on tbl2.Id = tbl3.ParentId inner join
Table4 tbl4
on tbl3.Id = tbl4.ParentId cross apply
(values (tbl1.id), (tbl2.id), (tbl3.id), (tbl4.id)) v(id);
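cross apply with a values list is SQL Server syntax, so I can't run it everywhere. As a rough illustration of what it does (one output row per listed column, for every joined row), this Python sqlite3 sketch swaps in a different technique: a cross join against a four-row helper table plus a CASE expression. The helper table selector, its column i and the output name CombinedId are all made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Id INTEGER, ParentId INTEGER);
    CREATE TABLE Table2 (Id INTEGER, ParentId INTEGER);
    CREATE TABLE Table3 (Id INTEGER, ParentId INTEGER);
    CREATE TABLE Table4 (Id INTEGER, ParentId INTEGER);
    INSERT INTO Table1 VALUES (1, 1);
    INSERT INTO Table2 VALUES (2, 1);
    INSERT INTO Table3 VALUES (3, 2);
    INSERT INTO Table4 VALUES (4, 3);

    -- Hypothetical helper: one row per column we want to unpivot.
    CREATE TABLE selector (i INTEGER);
    INSERT INTO selector VALUES (1), (2), (3), (4);
""")

# Each joined row is repeated four times (once per selector row), and
# CASE picks a different table's Id on each repetition.
ids = [row[0] for row in conn.execute("""
    SELECT CASE s.i
             WHEN 1 THEN tbl1.Id
             WHEN 2 THEN tbl2.Id
             WHEN 3 THEN tbl3.Id
             ELSE tbl4.Id
           END AS CombinedId
    FROM Table1 tbl1
    INNER JOIN Table2 tbl2 ON tbl1.Id = tbl2.ParentId
    INNER JOIN Table3 tbl3 ON tbl2.Id = tbl3.ParentId
    INNER JOIN Table4 tbl4 ON tbl3.Id = tbl4.ParentId
    CROSS JOIN selector s
    ORDER BY 1
""")]

print(ids)  # [1, 2, 3, 4]
```

The join chain is written only once, which is the point of avoiding the union.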
I have 2 select statements with a common column, POL.SP_NUM, which I wish to combine. I am new to SQL and haven't the slightest clue how to go about it.
Query 1:
select POL.SP_NUM POL#
, POL.ASSET_NUM COV#
, count(distinct(POLX.ATTRIB_06)) COUNT_ADDENDA
, count(distinct(POLX.ATTRIB_07)) COUNT_CERT
, sum(POL.QTY) SI
from S_ASSET POL
, S_ASSET_X POLX
Where POL.ROW_ID = POLX.ROW_ID
and POL.SP_NUM in ('000','111','222')
group by
POL.SP_NUM
, POL.ASSET_NUM
Query 1 output:
POL# COV# COUNT_ADDENDA COUNT_CERT SI
000 856 2 0 1000
111 123 0 0 500
222 567 0 1 2000
Query 2:
select POL#, sum(DOCI)
from (
select POL.SP_NUM POL#, sum(Q.AMT + POL.AMT) DOCI
from S_ASSET POL
, S_QUOTE_ITEM Q
where POL.X_QUOTE_ID = Q.ROW_ID
and POL.SP_NUM in ('000','111','222')
group by POL.SP_NUM
UNION ALL
select POL.SP_NUM POL#, sum(QXM.AMT) DOCI
from S_ASSET POL
, S_QUOTE_ITEM Q
, S_QUOTE_ITEM_XM QXM
where POL.X_QUOTE_ID = Q.ROW_ID
and Q.ROW_ID = QXM.PAR_ROW_ID
and POL.SP_NUM in ('000','111','222')
group by POL.SP_NUM
)
group by POL#
Query 2 output:
POL# sum(DOCI)
000 90
111 0
222 10
Desired output:
POL# COV# COUNT_ADDENDA COUNT_CERT SI sum(DOCI)
000 856 2 0 1000 90
111 123 0 0 500 0
222 567 0 1 2000 10
If there is a better way to code this, suggestions are welcome.
This is not an answer to the question, but an answer to the request, made in the comments section, to explain the join types.
INNER JOIN (or short: JOIN)
select * from t1 join t2 on t1.colx = t2.coly
only gives you matches. This is the most common join. You could replace the ON clause with a USING clause in case the columns in the ON clause have the same names in the tables. This is sometimes useful for writing a query quickly, but I would generally not recommend USING.
LEFT OUTER JOIN (or short: LEFT JOIN)
select * from t1 left join t2 on t1.colx = t2.coly
gives you all t1 records, no matter whether they have a match in t2. When there are one or more matches for a t1 record, you join these just as with an inner join, but when a t1 record has no match in t2, you get the t1 record along with an empty t2 record (all columns are NULL, even the columns you used in the ON clause, which is t2.coly in the above example). In other words: you get all records you'd get with an inner join plus all t1 records that have no match in t2.
You can also use a RIGHT JOIN so you'd keep t2 records when there is no t1 match:
select * from t1 right join t2 on t1.colx = t2.coly
but many people regard this as less readable, so it is better not to use right outer joins and simply swap the tables instead:
select * from t2 left join t1 on t1.colx = t2.coly
FULL OUTER JOIN (or short: FULL JOIN)
select * from t1 full outer join t2 on t1.colx = t2.coly
This gives you all records from both t1 and t2, no matter whether they have a match in the other table or not. Again: you get all records you'd get with an inner join plus all t1 records with no t2 match plus all t2 records with no t1 match.
When having several full outer joins the USING clause can come in handy:
select product, sum(p1.amount), sum(p2.amount), sum(p3.amount)
from p1
full outer join p2 using (product)
full outer join p3 using (product);
CROSS JOIN
A cross join joins a table without any criteria, so as to combine each of its records with each of the records already present. This is used to get all combinations and usually followed by a left outer join:
select products.product_id, regions.region_id, count(sales.product_id)
from products
cross join regions
left join sales on sales.product_id = products.product_id
and sales.region_id = regions.region_id
group by products.product_id, regions.region_id
order by products.product_id, regions.region_id;
This gives you all possible combinations of products and regions and counts the sales therein. So you get a result record even for product / region combinations where nothing was sold (i.e. no entry in table sales).
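A minimal sketch of that pattern, run through Python's sqlite3 module with made-up sample data. One detail worth seeing in practice: counting a column of the outer-joined table (here sales.product_id) reports 0 for combinations without sales, whereas count(*) counts the NULL-extended row itself and reports 1.

```python
import sqlite3

# Tiny made-up data set: 2 products, 2 regions, 3 sales.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product_id INTEGER);
    CREATE TABLE regions  (region_id INTEGER);
    CREATE TABLE sales    (product_id INTEGER, region_id INTEGER);
    INSERT INTO products VALUES (1), (2);
    INSERT INTO regions  VALUES (10), (20);
    INSERT INTO sales    VALUES (1, 10), (1, 10), (2, 20);
""")

query = """
    SELECT products.product_id, regions.region_id, {counter}
    FROM products
    CROSS JOIN regions
    LEFT JOIN sales ON sales.product_id = products.product_id
                   AND sales.region_id = regions.region_id
    GROUP BY products.product_id, regions.region_id
    ORDER BY products.product_id, regions.region_id
"""

# Counting a sales column ignores the NULLs of unmatched combinations.
by_column = conn.execute(query.format(counter="COUNT(sales.product_id)")).fetchall()

# COUNT(*) counts rows, so an unmatched combination still counts as 1.
by_star = conn.execute(query.format(counter="COUNT(*)")).fetchall()

print(by_column)  # [(1, 10, 2), (1, 20, 0), (2, 10, 0), (2, 20, 1)]
print(by_star)    # [(1, 10, 2), (1, 20, 1), (2, 10, 1), (2, 20, 1)]
```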
NATURAL JOIN
looks at common column names to magically join tables. My simple advice: never use this join type.
ANTI JOIN
This is not a join type actually, but a usage of a join, namely an outer join. Here you want to get all records from a table except the matches. You achieve this by outer-joining the tables and then removing matches in the where clause.
select t1.*
from t1
left join t2 on t1.colx = t2.coly
where t2.coly is null;
This looks odd, because we have EXISTS (and IN) to check for existence:
select *
from t1
where not exists (select * from t2 where t2.coly = t1.colx);
So why would one obfuscate things and use the anti join pattern instead? It is a trick used on weak DBMSs. When a DBMS is written, joins are the most important thing, and the developers of the DBMS put all their effort into making them fast. They may neglect EXISTS and IN at first and only later care about their performance. So it may help then to use a join technique (the anti join) instead. My recommendation: only use the anti join pattern when running into performance issues with a straightforward query. So far I've never had to use anti joins in more than twenty years. (It's good to have that option, though. And it's good to know about them, so as not to be confused when stumbling upon such a query some time. :-)
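For what it's worth, here is a small sketch (Python's sqlite3 module, made-up data) showing that the anti join and the NOT EXISTS query pick out the same t1 rows, including when t2 contains the same value more than once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (colx INTEGER);
    CREATE TABLE t2 (coly INTEGER);
    INSERT INTO t1 VALUES (1), (2), (3);
    INSERT INTO t2 VALUES (2), (2);   -- value 2 matches, and twice
""")

# Anti join: outer-join, then keep only the rows that found no partner.
anti_join = conn.execute("""
    SELECT t1.colx
    FROM t1
    LEFT JOIN t2 ON t1.colx = t2.coly
    WHERE t2.coly IS NULL
    ORDER BY t1.colx
""").fetchall()

# The straightforward formulation of the same question.
not_exists = conn.execute("""
    SELECT t1.colx
    FROM t1
    WHERE NOT EXISTS (SELECT * FROM t2 WHERE t2.coly = t1.colx)
    ORDER BY t1.colx
""").fetchall()

print(anti_join)   # [(1,), (3,)]
print(not_exists)  # [(1,), (3,)]
```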
You can join the queries:
select *
from (your query 1 here) query1
join (your query 2 here) query2 on query2.pol# = query1.pol#;
The same with WITH clauses:
with query1 as (your query 1 here),
query2 as (your query 2 here)
select *
from query1
join query2 on query2.pol# = query1.pol#;
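As a sketch of the WITH variant on something runnable: the tables policies and charges below are simplified stand-ins I invented (they are not the S_ASSET tables from the question), filled so that the sums match the sample output. Python's sqlite3 module is used only to execute the SQL.

```python
import sqlite3

# Hypothetical, simplified stand-ins for the real tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE policies (pol TEXT, cov INTEGER, qty INTEGER);
    CREATE TABLE charges  (pol TEXT, amt INTEGER);
    INSERT INTO policies VALUES ('000', 856, 1000), ('111', 123, 500), ('222', 567, 2000);
    INSERT INTO charges  VALUES ('000', 40), ('000', 50), ('111', 0), ('222', 10);
""")

rows = conn.execute("""
    WITH query1 AS (          -- one row per policy and coverage
        SELECT pol, cov, SUM(qty) AS si
        FROM policies
        GROUP BY pol, cov
    ),
    query2 AS (               -- one row per policy
        SELECT pol, SUM(amt) AS doci
        FROM charges
        GROUP BY pol
    )
    SELECT query1.pol, query1.cov, query1.si, query2.doci
    FROM query1
    JOIN query2 ON query2.pol = query1.pol
    ORDER BY query1.pol
""").fetchall()

print(rows)  # [('000', 856, 1000, 90), ('111', 123, 500, 0), ('222', 567, 2000, 10)]
```

Because each query is aggregated before the join, the sums are not inflated by the join. Note that a policy with several coverages would repeat the query2 value on each of its rows, and a policy missing from query2 would drop out unless the join is made a left join.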
I am new to SQL and if you have a spare moment, I was wondering whether anybody could help me replicate the Excel Vlookup function in SQL please?
From some research, I am suspecting that it is one of the join functions that I require, however, I don't want to just select data that is contained in both tables - I just want to lookup the value in 1 table against another.
If the data is contained in the lookup table then return the value and if not, just return NULL.
I have given a couple of example tables below to help illustrate my question.
Please note that Products 'C' and 'D' are not in Table2 but they are still in the result table but with NULL value.
Also I have a large number of unique products, so I am not looking for an answer which includes hard-coding, for example; CASE WHEN [Product] = 'A' THEN...
TABLE1
Product Quantity
-------------------
A 10
B 41
D 2
C 5
B 16
A 19
C 17
A 21
TABLE 2
Product Cost
-----------------
A £31.45
B £97.23
RESULT TABLE
Product Quantity Cost
-----------------------------
A 10 £31.45
B 41 £97.23
D 2 NULL
C 5 NULL
B 16 £97.23
A 19 £31.45
C 17 NULL
A 21 £31.45
It looks as if you need an outer join, I'll use a left one in my example:
select t1.Product, t1.Quantity, t2.Cost
from table1 as t1
left outer join table2 as t2
on t1.Product = t2.Product
You can also leave out the outer keyword:
select t1.Product, t1.Quantity, t2.Cost
from table1 as t1
left join table2 as t2
on t1.Product = t2.Product
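Here is that left join run against the sample data from the question, as a quick sketch using Python's sqlite3 module. Storing Cost as a plain number without the pound sign and ordering by rowid to keep the input order are choices I made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (Product TEXT, Quantity INTEGER);
    CREATE TABLE table2 (Product TEXT, Cost REAL);
    INSERT INTO table1 VALUES ('A', 10), ('B', 41), ('D', 2), ('C', 5),
                              ('B', 16), ('A', 19), ('C', 17), ('A', 21);
    INSERT INTO table2 VALUES ('A', 31.45), ('B', 97.23);
""")

# Every table1 row is kept; Cost is NULL (None in Python) when the
# product is missing from the lookup table.
result = conn.execute("""
    SELECT t1.Product, t1.Quantity, t2.Cost
    FROM table1 AS t1
    LEFT JOIN table2 AS t2 ON t1.Product = t2.Product
    ORDER BY t1.rowid
""").fetchall()

for row in result:
    print(row)
```

This behaves like a vlookup only as long as each product appears at most once in table2; a duplicated product there would duplicate the matching table1 rows.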
Here's an updated version of Lennart's answer which really works great.
select *
from table1 as t1
left outer join table2 as t2
on t1.Product = t2.Product
and t2.Product <> ''
left outer join table3 as t3
on t1.Product = t3.Product2
and t3.Product2 <> ''
The point is, you need to exclude rows where the join column of the lookup table is blank; otherwise blank values match each other and you will return far more rows than table1 has. A true vlookup does not add any rows to the left table.
I even added a third table for effect.
I have four tables:
T1
ID ID1 TITLE
1 100 TITLE1
2 100 TITLE2
3 100 TITLE3
T2
ID TEXT
1 LONG1
2 LONG2
T3
ID1 ID2
100 200
T4
ID4 ID2 SUBJECT
1 200 A
2 200 B
3 200 C
4 200 D
5 200 E
I want output in this result format:
TITLE TEXT SUBJECT
TITLE1 LONG1 A
TITLE2 LONG2 B
TITLE3 null C
null null D
null null E
So I made this query, but it gives me many more results than it should. For example, titles are displayed more than once.
SELECT
t1.title,
t2.text,
t4.subject
FROM t1
LEFT OUTER JOIN t2 ON t1.id=t2.id
INNER JOIN t3 ON t1.id1=t3.id1
LEFT OUTER JOIN t4 ON t4.id2=t3.id2
WHERE
t1.id1=100
Thanks for help
Disclaimer: I don't work with DB2. After some browsing through the documentation I found that DB2 supports row_number() and full outer join, but I might easily be wrong.
To get rid of the n:m relationship one has to build an additional key. A simple solution in this case is to add a row number to each record in t1 and t4 and use it as a join condition. row_number() does just that: it numbers the rows within each group defined by partition by, in ascending sequence, in the order defined by order by.
As the number of records in t1 and t4 differs, and it is unknown which one will have more records, I use a full outer join to join them.
You can see the test (SQL Server version) at SQL Fiddle.
select t1_rn.title,
t2.[text],
t4_rn.subject
from
(
select t1.id,
t1.title,
t1.id1,
t3.id2,
row_number() over(partition by t1.id1
order by id) rn
from t1
inner join t3
on t1.id1 = t3.id1
) t1_rn
full outer join
(
select t4.subject,
t3.id1,
t4.id2,
row_number() over(partition by t4.id2
order by id4) rn
from t4
inner join t3
on t4.id2 = t3.id2
) t4_rn
on t1_rn.id1 = t4_rn.id1
and t1_rn.id2 = t4_rn.id2
and t1_rn.rn = t4_rn.rn
left join t2
on t1_rn.id = t2.id
This kind of work should definitely be done on the presentation side of an application, but I believe the software you are using requires the data to be prepared already.
Try this:
select t1.title,t2.text,t4.subject
from t4
left join t3
on t4.id2=t3.id2
left join t1
on t1.id1=t3.id1
left join t2
on t1.id=t2.id
where t1.id1=100
You should change your tables. Your last join is what does that to your output; just analyze your query: for every record from T1 you get every record from T4.
Outer joins can replicate rows when a key matches more than one row on the other side, instead of matching only the ones you need. You may want to look at this:
http://blog.sqlauthority.com/2009/04/13/sql-server-introduction-to-joins-basic-of-joins/
To understand what the join types are, and how you can use them.
You are looking for a list of subjects with their associated text and title, but the combination may not be unique; more than one null exists for the titles. You want to drive the join from table 4 and get a list of subjects with the associated titles for each.
Looking at your output, it appears you want all subjects displayed. Knowing this, you should first build everything off this table.
SELECT columns
FROM T4
Next build up your inner joins.
SELECT columns
FROM T4 subjectTable
INNER JOIN T3 mapTable
ON mapTable.ID2 = subjectTable.ID2
When happy with them, add on your optional columns with the outer join.
SELECT columns
FROM T4 subjectTable
INNER JOIN T3 mapTable
ON mapTable.ID2 = subjectTable.ID2
LEFT OUTER JOIN T2 textTable
ON textTable.ID = subjectTable.ID4
LEFT OUTER JOIN T1 titleTable
ON titleTable.ID1 = mapTable.ID1
WHERE
mapTable.ID1 = 100;
For example, there are two tables:
create table Table1 (id int, Name varchar (10))
create table Table2 (id int, Name varchar (10))
Table1 data as follows:
Id Name
-------------
1 A
2 B
Table2 data as follows:
Id Name
-------------
1 A
2 B
3 C
If I execute both below mentioned SQL statements, both outputs will be the same:
select *
from Table1
left join Table2 on Table1.id = Table2.id
select *
from Table2
right join Table1 on Table1.id = Table2.id
Please explain the difference between left and right join in the above SQL statements.
Select * from Table1 left join Table2 ...
and
Select * from Table2 right join Table1 ...
are indeed completely interchangeable. Try however Table2 left join Table1 (or its identical pair, Table1 right join Table2) to see a difference. This query should give you more rows, since Table2 contains a row with an id which is not present in Table1.
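A quick way to see this is to run both directions against the sample data. The sketch below uses Python's sqlite3 module and sticks to left join only (swapping the table order), since right join support varies between engines and versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (id INTEGER, Name TEXT);
    CREATE TABLE Table2 (id INTEGER, Name TEXT);
    INSERT INTO Table1 VALUES (1, 'A'), (2, 'B');
    INSERT INTO Table2 VALUES (1, 'A'), (2, 'B'), (3, 'C');
""")

# Table1 is the preserved side: only its two rows can appear.
t1_left_t2 = conn.execute("""
    SELECT *
    FROM Table1
    LEFT JOIN Table2 ON Table1.id = Table2.id
    ORDER BY Table1.id
""").fetchall()

# Table2 is the preserved side: id 3 has no partner in Table1,
# so the Table1 columns come back as NULL (None in Python).
t2_left_t1 = conn.execute("""
    SELECT *
    FROM Table2
    LEFT JOIN Table1 ON Table1.id = Table2.id
    ORDER BY Table2.id
""").fetchall()

print(t1_left_t2)  # [(1, 'A', 1, 'A'), (2, 'B', 2, 'B')]
print(t2_left_t1)  # [(1, 'A', 1, 'A'), (2, 'B', 2, 'B'), (3, 'C', None, None)]
```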
Table from which you are taking data is 'LEFT'.
Table you are joining is 'RIGHT'.
LEFT JOIN: Take all items from left table AND (only) matching items from right table.
RIGHT JOIN: Take all items from right table AND (only) matching items from left table.
So:
Select * from Table1 left join Table2 on Table1.id = Table2.id
gives:
Id   Name   Id   Name
-----------------------
1    A      1    A
2    B      2    B
but:
Select * from Table1 right join Table2 on Table1.id = Table2.id
gives:
Id     Name   Id   Name
-------------------------
1      A      1    A
2      B      2    B
NULL   NULL   3    C
You were right joining the table with fewer rows to the table with more rows
AND
again, left joining the table with fewer rows to the table with more rows
Try:
If Table1.Rows.Count > Table2.Rows.Count Then
' Left Join
Else
' Right Join
End If
You seem to be asking, "If I can rewrite a RIGHT OUTER JOIN using LEFT OUTER JOIN syntax then why have a RIGHT OUTER JOIN syntax at all?" I think the answer to this question is, because the designers of the language didn't want to place such a restriction on users (and I think they would have been criticized if they did), which would force users to change the order of tables in the FROM clause in some circumstances when merely changing the join type.
select fields
from tableA --left
left join tableB --right
on tableA.key = tableB.key
The table in the from clause, tableA in this example, is on the left side of the relation.
tableA <- tableB
[left]------[right]
So if you want to take all rows from the left table (tableA), even if there are no matches in the right table (tableB), you'll use the "left join".
And if you want to take all rows from the right table (tableB), even if there are no matches in the left table (tableA), you will use the right join.
Thus, the following query is equivalent to that used above.
select fields
from tableB
right join tableA on tableB.key = tableA.key
Your two statements are equivalent.
Most people only use LEFT JOIN since it seems more intuitive, and it's universal syntax - I don't think all RDBMS support RIGHT JOIN.
I feel we may require an AND condition in the WHERE clause of the last figure, Outer Excluding JOIN, so that we get the desired result of A Union B minus A Intersection B.
I feel query needs to be updated to
SELECT <select_list>
FROM Table_A A
FULL OUTER JOIN Table_B B
ON A.Key = B.Key
WHERE A.Key IS NULL AND B.Key IS NULL
If we use OR , then we will get all the results of A Union B
select *
from Table1
left join Table2 on Table1.id = Table2.id
In the first query, the left join compares the left-hand table, Table1, to the right-hand table, Table2.
All rows of Table1 are shown, whereas from Table2 only those rows are shown for which the condition is true.
select *
from Table2
right join Table1 on Table1.id = Table2.id
In the second query, the right join compares the right-hand table, Table1, to the left-hand table, Table2.
Again all rows of Table1 are shown, whereas from Table2 only those rows are shown for which the condition is true.
Both queries give the same result because the order in which the tables are declared differs: in the left join query Table1 and Table2 are declared on the left and right respectively, while in the right join query Table1 and Table2 are declared on the right and left respectively.
This is the reason why you are getting the same result from both queries. So if you want different results, execute these two queries instead:
select *
from Table1
left join Table2 on Table1.id = Table2.id
select *
from Table1
right join Table2 on Table1.id = Table2.id
Select * from Table1 t1 Left Join Table2 t2 on t1.id=t2.id
By definition, a left join returns every row of Table 1, with the columns mentioned after the "select" keyword, together with the columns of those Table 2 rows that match the criteria after the "on" keyword; where no Table 2 row matches, its columns are NULL.
Similarly, by definition, a right join returns every row of Table 2 together with the columns of those Table 1 rows that match the criteria after the "on" keyword.
Referring to your question: the ids of both tables are compared, and select * puts all columns of both tables in the output. Ids 1 and 2 are common to both tables, so the result has four columns: id and name from the first table followed by id and name from the second.
select *
from Table1
left join Table2 on Table1.id = Table2.id
The above expression takes all the records (rows) from table 1 and, where the ids match, the columns of the corresponding rows from table 2.
select *
from Table2
right join Table1 on Table1.id = Table2.id
Similarly, the above expression takes all the records (rows) from table 1 and, where the ids match, the columns of the corresponding rows from table 2 (remember, this is a right join, so it is the table named on the right, table 1 here, whose rows are all kept).