I have 4 tables in SQL Server 2012. This is my diagram:
I have this query:
SELECT
pc.Product_ID, d.Dept_ID, c.Category_ID, sc.SubCategory_ID
FROM
dbo.ProductConfiguration pc
INNER JOIN
dbo.SubCategory sc ON sc.SubCategory_ID = pc.SubCategory_ID
INNER JOIN
dbo.Category c ON c.Category_ID = sc.Category_ID
INNER JOIN
dbo.Department d ON d.Dept_ID = c.Dept_ID
WHERE
pc.Product_ID = 459218
What is the best way, (INNER, LEFT, RIGHT) to get columns values? I need be careful with performance
Thanks a lot
This has nothing to do with performance.
I suppose every product belongs to a subcategory which belongs to a category which belongs to a department. So you inner join all the tables; this is the normal thing to do.
If one of the entities were not mandatory, but optional, say there are categories that don't belong to a department 1, but you still wanted to show the row, then you'd outer join the table. You'd use a left outer join on departments. We never use right outer joins, because these are supposed to be harder to read, and they do essentially the same as left joins.
1) i.e. category.dept_id coud be null
your query best variant is LEFT JOIN, from its do not loses your information
I would recommend to you, use execution plans when you need to update the performance. using execution plan you can understand the behaviors. it will grate help for future.
Thank You!
Related
I have multiple SQL queries that look similar where one uses JOIN and another LEFT OUTER JOIN. I played around with SQL and found that it the same results are returned. The codebase uses JOIN and LEFT OUTER JOIN interchangeably. While LEFT JOIN seems to be interchangeable with LEFT OUTER JOIN, I cannot I cannot seem to find any information about only JOIN. Is this good practice?
Ex Query1 using JOIN
SQL
SELECT
id,
name
FROM
u_users customers
JOIN
t_orders orders
ON orders.status=='PAYMENT PENDING'
Ex. Query2 using LEFT OUTER JOIN
SQL
SELECT
id,
name
FROM
u_users customers
LEFT OUTER JOIN
t_orders orders
ON orders.status=='PAYMENT PENDING'
As previously noted above:
JOIN is synonym of INNER JOIN. It's definitively different from all
types of OUTER JOIN
So the question is "When should I use an outer join?"
Here's a good article, with several great diagrams:
https://www.sqlshack.com/sql-outer-join-overview-and-examples/
The short answer your your question is:
Prefer JOIN (aka "INNER JOIN") to link two related tables. In practice, you'll use INNER JOIN most of the time.
INNER JOIN is the intersection of the two tables. It's represented by the "green" section in the middle of the Venn diagram above.
Use an "Outer Join" when you want the left, right or both outer regions.
In your example, the result set happens to be the same: the two expressions happen to be equivalent.
ALSO: be sure to familiarize yourself with "Show Plan" (or equivalent) for your RDBMS: https://www.sqlshack.com/execution-plans-in-sql-server/
'Hope that helps...
First the theory:
A join is a subset of the left join (all other things equal). Under some circumstances they are identical
The difference is that the left join will include all the tuples in the left hand side relation (even if they don't match the join predicate), while the join will only include the tuples of the left hand side that match the predicate.
For instance assume we have to relations R and S.
Say we have to do R JOIN S (and R LEFT JOIN S) on some predicate p
J = R JOIN S on (p)
Now, identify the tuples of R that are not in J.
Finally, add those tuples to J (padding any attribute in J not in R with null)
This result is the left join:
R LEFT JOIN S (p)
So when all the tuples of the left hand side of the relation are in the JOIN, this result will be identical to the Left Join.
back to you problem:
Your JOIN is very likely to include all the tuples from Users. So the query is the same if you use JOIN or LEFT JOIN.
The two are exactly equivalent, because the WHERE clause turns the LEFT JOIN into an INNER JOIN.
When filtering on all but the first table in a LEFT JOIN, the condition should usually be in the ON clause. Presumably, you also have a valid join condition, connecting the two tables:
SELEC id, name
FROM u_users u LEFT JOIN
t_orders o
ON o.user_id = u.user_id AND o.status = 'PAYMENT PENDING';
This version differs from the INNER JOIN version, because this version returns all users even those with no pending payments.
Both are the same, there is no difference here.
You need to use the ON clause when using Join. It can match any data between two tables when you don't use the ON clause.
This can cause performance issue as well as map unwanted data.
If you want to see the differences you can use "execution plans".
for example, I used the Microsoft AdventureWorks database for the example.
LEFT OUTER JOIN :
LEFT JOIN :
If you use the ON clause as you wrote, there is a possibility of looping.
Example "execution plans" is below.
You can access the correct mapping and data using the ON clause complement.
select
id,
name
from
u_users customers
left outer join
t_orders orders on customers.id = orders.userid
where orders.status=='payment pending'
I am importing a view from an old database as part of a software upgrade and I saw this FROM clause as I was working.
FROM
t_store_master AS STM
RIGHT OUTER JOIN
dbo.t_shipping AS SHP
INNER JOIN
t_hu_master AS HUM
INNER JOIN
t_stored_item AS STO ON HUM.hu_id = STO.hu_id
ON SHP.shipment_id = HUM.reserved_for
AND SHP.location_id = HUM.location_id
INNER JOIN
dbo.t_carrier AS CAR ON SHP.carrier_code = CAR.carrier_code
LEFT OUTER JOIN
dbo.t_order_detail AS ORD
INNER JOIN
dbo.t_order AS ORM ON ORD.order_number = ORM.order_number
INNER JOIN
dbo.t_pick_detail PDL ON ORD.order_number = PDL.order_number
ON STO.type = PDL.pick_id
ON STM.store_id = HUM.control_number_2
It is my understanding that OUTER JOIN requires an ON clause in order to be interpreted properly by the compiler. However, there must be some edge case scenarios I am unaware of because the code above returns good data without throwing any errors.
After some goggling and reading up on MSDN standards for OUTER JOINs, I am still at a loss for what the LEFT OUTER JOIN and RIGHT OUTER JOIN are doing in this query.
I do know that the multiple ON clauses beneath the INNER JOINs are necessary. There seems to be some kind of hidden or implied mapping being done between the ON clauses and the OUTER JOINs. Past that I cannot tell what the purpose of writing a query this way would be.
Could someone shed some insight on how this works and why it would be written this way?
This is allowed. It makes slightly more sense with parentheses:
FROM t_store_master STM RIGHT OUTER JOIN
(dbo.t_shipping SHP INNER JOIN
(t_hu_master HUM INNER JOIN
t_stored_item STO
ON HUM.hu_id = STO.hu_id
)
ON SHP.shipment_id = HUM.reserved_for AND
SHP.location_id = HUM.location_id
) INNER JOIN
(dbo.t_carrier CAR
ON SHP.carrier_code = CAR.carrier_code LEFT OUTER JOIN
((dbo.t_order_detail ORD INNER JOIN
dbo.t_order ORM
ON ORD.order_number = ORM.order_number
) INNER JOIN
dbo.t_pick_detail PDL
ON ORD.order_number = PDL.order_number
)
ON STO.type = PDL.pick_id
)
ON STM.store_id = HUM.control_number_2
That said, I would recommend never writing a query like this and rewriting the query ASAP if it is in production code. At the very least, add the parentheses!
Parentheses are almost never needed in the FROM clause to express JOINs (there is one case where I do happen to use them). Non-interleaved ON clauses are never needed -- or at least, I have never had occasion to use them or think they were the best way to write a query. But, both are allowed.
I have a query in my WordPress plugin like this:
SELECT users.*, U.`meta_value` AS first_name,M.`meta_value` AS last_name
FROM `nwp_users` AS users
LEFT JOIN `nwp_usermeta` U
ON users.`ID`=U.`user_id`
LEFT JOIN `nwp_usermeta` M
ON users.`ID`=M.`user_id`
LEFT JOIN `nwp_usermeta` C
ON users.`ID`=C.`user_id`
WHERE U.meta_key = 'first_name'
AND M.meta_key = 'last_name'
AND C.meta_key = 'nwp_capabilities'
ORDER BY users.`user_login` ASC
LIMIT 0,10
I'm new to using JOIN and I'm wondering how efficient it is to use so many JOIN in one query. Is it better to split it up into multiple queries?
The database schema can be found here.
JOIN usually isn't so bad if the keys are indexed. LEFT JOIN is almost always a performance hit and you should avoid it if possible. The difference is that LEFT JOIN will join all rows in the joined table even if the column you're joining is NULL. While a regular (straight) JOIN just joins the rows that match.
Post your table structure and we can give you a better query.
See this comment:
http://forums.mysql.com/read.php?24,205080,205274#msg-205274
For what it's worth, to find out what MySQL is doing and to see if you have indexed properly, always check the EXPLAIN plan. You do this by putting EXPLAIN before your query (literally add the word EXPLAIN before the query), then run it.
In your query, you have a filter AND C.meta_key = 'nwp_capabilities' which means that all the LEFT JOINs above it can be equally written as INNER JOINs. Because if the LEFT JOINS fail (LEFT OUTER is intended to preserve the results from the left side), the result will 100% be filtered out by the WHERE clause.
So a more optimal query would be
SELECT users.*, U.`meta_value` AS first_name,M.`meta_value` AS last_name
FROM `nwp_users` AS users
JOIN `nwp_usermeta` U
ON users.`ID`=U.`user_id`
JOIN `nwp_usermeta` M
ON users.`ID`=M.`user_id`
JOIN `nwp_usermeta` C
ON users.`ID`=C.`user_id`
WHERE U.meta_key = 'first_name'
AND M.meta_key = 'last_name'
AND C.meta_key = 'nwp_capabilities'
ORDER BY users.`user_login` ASC
LIMIT 0,10
(note: "JOIN" (alone) = "INNER JOIN")
Try explaining the query to see what is going on and if your select if optimized. If you haven't used explain before read some tutorials:
http://www.learn-mysql-tutorial.com/OptimizeQueries.cfm
http://www.databasejournal.com/features/mysql/article.php/1382791/Optimizing-MySQL-Queries-and-Indexes.htm
I've never learned how joins work but just using select and the where clause has been sufficient for all the queries I've done. Are there cases where I can't get the right results using the WHERE clause and I have to use a JOIN? If so, could someone please provide examples? Thanks.
Implicit joins are more than 20 years out-of-date. Why would you even consider writing code with them?
Yes, they can create problems that explicit joins don't have. Speaking about SQL Server, the left and right join implicit syntaxes are not guaranteed to return the correct results. Sometimes, they return a cross join instead of an outer join. This is a bad thing. This was true even back to SQL Server 2000 at least, and they are being phased out, so using them is an all around poor practice.
The other problem with implicit joins is that it is easy to accidentally do a cross join by forgetting one of the where conditions, especially when you are joining too many tables. By using explicit joins, you will get a syntax error if you forget to put in a join condition and a cross join must be explicitly specified as such. Again, this results in queries that return incorrect values or are fixed by using distinct to get rid of the cross join which is inefficient at best.
Moreover, if you have a cross join, the maintenance developer who comes along in a year to make a change doesn't know if it was intended or not when you use implicit joins.
I believe some ORMs also now require explicit joins.
Further, if you are using implied joins because you don't understand how joins operate, chances are high that you are writing code that, in fact, does not return the correct result because you don't know how to evaluate what the correct result would be since you don't understand what a join is meant to do.
If you write SQL code of any flavor, there is no excuse for not thoroughly understanding joins.
Yes. When doing outer joins. You can read this simple article on joins. Joins are not hard to understand at all so you should start learning (and using them where appropriate) right away.
Are there cases where I can't get the right results using the WHERE clause and I have to use a JOIN?
Any time your query involves two or more tables, a join is being used. This link is great for showing the differences in joins with pictures as well as sample result sets.
If the join criteria is in the WHERE clause, then the ANSI-89 JOIN syntax is being used. The reason for the newer JOIN syntax in the ANSI-92 format, is that it made LEFT JOIN more consistent across various databases. For example, Oracle used (+) on the side that was optional while in SQL Server you had to use =*.
Implicit join syntax by default uses Inner joins. It is sometimes possible to modify the implicit join syntax to specify outer joins, but it is vendor dependent in my experience (i know oracle has the (-) and (+) notation, and I believe sqlserver uses *= ). So, I believe your question can be boiled down to understanding the differences between inner and outer joins.
We can look at a simple example for an inner vs outer join using a simple query..........
The implicit INNER join:
select a.*, b.*
from table a, table b
where a.id = b.id;
The above query will bring back ONLY rows where the 'a' row has a matching row in 'b' for it's 'id' field.
The explicit OUTER JOIN:
select * from
table a LEFT OUTER JOIN table b
on a.id = b.id;
The above query will bring back EVERY row in a, whether or not it has a matching row in 'b'. If no match exists for 'b', the 'b' fields will be null.
In this case, if you wanted to bring back EVERY row in 'a' regardless of whether it had a corresponding 'b' row, you would need to use the outer join.
Like I said, depending on your database vendor, you may still be able to use the implicit join syntax and specify an outer join type. However, this ties you to that vendor. Also, any developers not familiar wit that specialized syntax may have difficulty understanding your query.
Any time you want to combine the results of two tables you'll need to join them. Take for example:
Users table:
ID
FirstName
LastName
UserName
Password
and Addresses table:
ID
UserID
AddressType (residential, business, shipping, billing, etc)
Line1
Line2
City
State
Zip
where a single user could have his home AND his business address listed (or a shipping AND a billing address), or no address at all. Using a simple WHERE clause won't fetch a user with no addresses because the addresses are in a different table. In order to fetch a user's addresses now, you'll need to do a join as:
SELECT *
FROM Users
LEFT OUTER JOIN Addresses
ON Users.ID = Addresses.UserID
WHERE Users.UserName = "foo"
See http://www.w3schools.com/Sql/sql_join.asp for a little more in depth definition of the different joins and how they work.
Using Joins :
SELECT a.MainID, b.SubValue AS SubValue1, b.SubDesc AS SubDesc1, c.SubValue AS SubValue2, c.SubDesc AS SubDesc2
FROM MainTable AS a
LEFT JOIN SubValues AS b ON a.MainID = b.MainID AND b.SubTypeID = 1
LEFT JOIN SubValues AS c ON a.MainID = c.MainID AND b.SubTypeID = 2
Off-hand, I can't see a way of getting the same results as that by using a simple WHERE clause to join the tables.
Also, the syntax commonly used in WHERE clauses to do left and right joins (*= and =*) is being phased out,
Oracle supports LEFT JOIN and RIGHT JOIN using their special join operator (+) (and SQL Server used to support *= and =* on join predicates, but no longer does). But a simple FULL JOIN can't be done with implicit joins alone:
SELECT f.title, a.first_name, a.last_name
FROM film f
FULL JOIN film_actor fa ON f.film_id = fa.film_id
FULL JOIN actor a ON fa.actor_id = a.actor_id
This produces all films and their actors including all the films without actor, as well as the actors without films. To emulate this with implicit joins only, you'd need unions.
-- Inner join part
SELECT f.title, a.first_name, a.last_name
FROM film f, film_actor fa, actor a
WHERE f.film_id = fa.film_id
AND fa.actor_id = a.actor_id
-- Left join part
UNION ALL
SELECT f.title, null, null
FROM film f
WHERE NOT EXISTS (
SELECT 1
FROM film_actor fa
WHERE fa.film_id = f.film_id
)
-- Right join part
UNION ALL
SELECT null, a.first_name, a.last_name
FROM actor a
WHERE NOT EXISTS (
SELECT 1
FROM film_actor fa
WHERE fa.actor_id = a.actor_id
)
This will quickly become very inefficient both syntactically as well as from a performance perspective.
I use INNER JOIN and LEFT OUTER JOINs all the time. However, I never seem to need RIGHT OUTER JOINs, ever.
I've seen plenty of nasty auto-generated SQL that uses right joins, but to me, that code is impossible to get my head around. I always need to rewrite it using inner and left joins to make heads or tails of it.
Does anyone actually write queries using Right joins?
In SQL Server one edge case where I have found right joins useful is when used in conjunction with join hints.
The following queries have the same semantics but differ in which table is used as the build input for the hash table (it would be more efficient to build the hash table from the smaller input than the larger one which the right join syntax achieves)
SELECT #Large.X
FROM #Small
RIGHT HASH JOIN #Large ON #Small.X = #Large.X
WHERE #Small.X IS NULL
SELECT #Large.X
FROM #Large
LEFT HASH JOIN #Small ON #Small.X = #Large.X
WHERE #Small.X IS NULL
Aside from that (product specific) edge case there are other general examples where a RIGHT JOIN may be useful.
Suppose that there are three tables for People, Pets, and Pet Accessories. People may optionally have pets and these pets may optionally have accessories
CREATE TABLE Persons
(
PersonName VARCHAR(10) PRIMARY KEY
);
INSERT INTO Persons
VALUES ('Alice'),
('Bob'),
('Charles');
CREATE TABLE Pets
(
PetName VARCHAR(10) PRIMARY KEY,
PersonName VARCHAR(10)
);
INSERT INTO Pets
VALUES ('Rover',
'Alice'),
('Lassie',
'Alice'),
('Fifi',
'Charles');
CREATE TABLE PetAccessories
(
AccessoryName VARCHAR(10) PRIMARY KEY,
PetName VARCHAR(10)
);
INSERT INTO PetAccessories
VALUES ('Ball', 'Rover'),
('Bone', 'Rover'),
('Mouse','Fifi');
If the requirement is to get a result listing all people irrespective of whether or not they own a pet and information about any pets they own that also have accessories.
This doesn't work (Excludes Bob)
SELECT P.PersonName,
Pt.PetName,
Pa.AccessoryName
FROM Persons P
LEFT JOIN Pets Pt
ON P.PersonName = Pt.PersonName
INNER JOIN PetAccessories Pa
ON Pt.PetName = Pa.PetName;
This doesn't work (Includes Lassie)
SELECT P.PersonName,
Pt.PetName,
Pa.AccessoryName
FROM Persons P
LEFT JOIN Pets Pt
ON P.PersonName = Pt.PersonName
LEFT JOIN PetAccessories Pa
ON Pt.PetName = Pa.PetName;
This does work (but the syntax is much less commonly understood as it requires two ON clauses in succession to achieve the desired logical join order)
SELECT P.PersonName,
Pt.PetName,
Pa.AccessoryName
FROM Persons P
LEFT JOIN Pets Pt
INNER JOIN PetAccessories Pa
ON Pt.PetName = Pa.PetName
ON P.PersonName = Pt.PersonName;
All in all probably easiest to use a RIGHT JOIN
SELECT P.PersonName,
Pt.PetName,
Pa.AccessoryName
FROM Pets Pt
JOIN PetAccessories Pa
ON Pt.PetName = Pa.PetName
RIGHT JOIN Persons P
ON P.PersonName = Pt.PersonName;
Though if determined to avoid this another option would be to introduce a derived table that can be left joined to
SELECT P.PersonName,
T.PetName,
T.AccessoryName
FROM Persons P
LEFT JOIN (SELECT Pt.PetName,
Pa.AccessoryName,
Pt.PersonName
FROM Pets Pt
JOIN PetAccessories Pa
ON Pt.PetName = Pa.PetName) T
ON T.PersonName = P.PersonName;
SQL Fiddles: MySQL, PostgreSQL, SQL Server
It depends on what side of the join you put each table.
If you want to return all rows from the left table, even if there are no matches in the right table... you use left join.
If you want to return all rows from the right table, even if there are no matches in the left table, you use right join.
Interestingly enough, I rarely used right joins.
You usually use RIGHT OUTER JOINS to find orphan items in other tables.
No, I don't for the simple reason I can accomplish everything with inner or left joins.
I only use left, but let me say they are really the same depending how how you order things. I worked with some people that only used right, becasue they built queries from the inside out and liked to keep their main items at the bottom thus in their minds it made sense to only use right.
I.e.
Got main thing here
need more junk
More Junk right outer join Main Stuff
I prefer to do main stuff then junk... So left outer works for me.
So whatever floats your boat.
The only time I use a Right outer join is when I am working on an existing query and need to change it (normally from an inner). I could reverse the join and make it a left and probably be ok, but I try and reduce the amount of things I change when I modify code.
Our standard practice here is to write everything in terms of LEFT JOINs if possible. We've occasionally used FULL OUTER JOINs if we've needed to, but never RIGHT JOINs.
You can accomplish the same thing using LEFT or RIGHT joins. Generally most people think in terms of a LEFT join probably because we read from left to right. It really comes down to being consistent. Your team should focus on using either LEFT or RIGHT joins, not both, as they are essentially the same exact thing, written differently.
Rarely, as stated you can usually reorder and use a left join. Also I naturally tend to order the data, so that left joins work for getting the data I require. I think the same can be said of full outer and cross joins, most people tend to stay away from them.