using parenthesis in SQL - sql

What's the differences between these SQLs:
SELECT *
FROM COURS NATURAL JOIN COMPOSITIONCOURS NATURAL JOIN PARCOURS;
SELECT *
FROM COURS NATURAL JOIN (COMPOSITIONCOURS C JOIN PARCOURS P ON C.IDPARCOURS = P.IDPARCOURS) ;
SELECT *
FROM (COURS NATURAL JOIN COMPOSITIONCOURS C) JOIN PARCOURS P ON C.IDPARCOURS = P.IDPARCOURS ;
They have different results.

It's difficult to tell precisely without a sample result set, but I would imagine its your utilization of natural JOINs which is typically bad practice. Reasons being:
I would avoid using natural joins, because natural joins are:
Not standard SQL.
Not informative. You aren't specifying what columns are being joined without referring to the schema.
Your conditions are vulnerable to schema changes. If there are multiple natural join columns and one such column is removed from a table, the query will still execute, but probably not correctly and this change in behavior will be silent.

Thanks for all of yours answers! I finally find out that the problem is because of the use of natural JoIn.
When using natural join, it will join every column with the same name! When I am using NATURAL JOIN for several times , it's possible those columns, which have same names but you don't wants to combine , could be joined automatically!
Here is an example,
Table a (IDa, name,year,IDb)
Table b (IDb, bonjour,salute,IDc)
Table c (IDc, year, merci)
If I wrote like From a Natural Join b Natural c,because IDb and IDc are common names for Table a,b,c. It seems Ok! But attention please!
a.year and b.year can also be joined ! That's what we don't want!

Related

Access SQL subquery access field from parent

I have a SQL query that works in Access 2016:
SELECT
Count(*) AS total_tests,
Sum(IIf(score>=securing_threshold And score<mastering_threshold,1,0)) AS total_securing,
Sum(IIf(score>=mastering_threshold,1,0)) AS total_mastering,
total_securing/Count(*) AS percent_securing,
total_mastering/Count(*) AS percent_mastering,
(Count(*)-total_securing-total_mastering)/Count(*) AS percent_below,
subjects.subject,
students.year_entered,
IIf(Month(Date())<9,Year(Date())-students.year_entered+6,Year(Date())-students.year_entered+7) AS current_form,
groups.group
FROM
((subjects
INNER JOIN tests ON subjects.ID = tests.subject)
INNER JOIN (students
INNER JOIN test_results ON students.ID = test_results.student) ON tests.ID = test_results.test)
LEFT JOIN
(SELECT * FROM group_membership LEFT JOIN groups ON group_membership.group = groups.ID) As g
ON students.ID = g.student
GROUP BY subjects.subject, students.year_entered, groups.group;
However, I wish to filter out irrelevant groups before joining them to my table. The table groups has a column subject which is a foreign key.
When I try changing ON students.ID = g.student to ON students.ID = g.student And subjects.ID = g.subject I get the error 'JOIN expression not supported'.
Alternatively, when I try adding WHERE subjects.ID = groups.subject to the subquery, it asks me for the parameter value of subjects.ID, although it is a column in the parent query.
Googling reveals many similar errors but they were all resolved by changing the brackets. That didn't help.
Just in case the table relationships help:
Thank you.
EDIT: Sample database at https://www.dropbox.com/s/yh80oooem6gsni7/student%20tracker.ACCDB?dl=0
MS Access queries with many joins are difficult to update by SQL alone as parenetheses pairings are required unlike other RDBMS's and these pairings must follow an order. Moreover, some pairings can even be nested. Hence, for beginners it is advised to build queries with complex and many joins using the query design GUI in the MS Office application and let it build out the SQL.
For a simple filter on the g derived table, you could filter subject on the derived table, g, but likely you want want all subjects:
...
(SELECT * FROM group_membership
LEFT JOIN groups ON group_membership.group = groups.ID
WHERE groups.subject='Earth Science') As g
...
So for all subjects, consider re-building query from scratch in GUI that nearly mirrors your table relationships which actually auto-links joins in the GUI. Then, drop unneeded tables.
Usually you want to begin with the join table or set like groups and group_membership or tests and test_results. In fact, consider saving the g derived table as its own query.
Then add the distinct record primary source tables like students and subjects.
You may even need to play around with order in FROM and JOIN clauses to attain desired results, and maybe even add the same table in query. And be careful with adding join tables like group_membership (two one-to-many links), to GROUP BY queries as it leads to the duplicate record aggregation. So you may need to join aggregates queries by subject.
Unless you can post content of all tables, from our perspective it is difficult to help from here.
Your subquery g uses a LEFT JOIN, but there is a enforced 1:n relation between the two tables, so there will always be a matching group. Use a INNER JOIN instead.
With g.subject you are trying to join on a column that is on the right side of a left join, that cannot really work.
Also you shouldn't use SELECT * on a join of tables with identical column names. Include only the qualified column names that you need.
LEFT JOIN
(SELECT group_membership.student, groups.group, groups.subject
FROM group_membership INNER JOIN groups
ON group_membership.group = groups.ID) As g
ON (students.ID = g.student AND subjects.ID = g.subject)
I would call the columns in group_membership group_ID and student_ID to avoid confusion.
I don't have the database to test, but I would use subject table as subquery:
(SELECT * FROM subject WHERE filter out what you don't need) Subj
Then INNER JOIN this new Subj Table in your query which would exclude irrelevant groups.
Also I would never create join in WHERE clause (WHERE subjects.ID = groups.subject), what this does it creates cartesian product (table with all the possible combinations of subjects.ID and groups.subject) then it filters out records to satisfy your join. When dealing with huge data it might take forever or crash.
Error related to "Join expression may not be supported"; do datatypes match in those fields?
I solved it by (a lot of trial and error and) taking the advice here to make queries in the GUI and joining them. The end result is 4 queries deep! If the database was bigger, performance would be awful... but now its working I can tweak it eventually.
Thank you everybody!

Commutativity of Inner Join

I read from this answer (click), the following conditional statements
Invoices.CustomerID=Customers.CustomerID
and
Customers.CustomerID=Invoices.CustomerID
are identical because it produces the same result set.
Now, my problem is about commutativity of inner join. I have tried both of the following approaches and they produce the same result set (except for the column order).
Customers table first
use MMABooks
select *
from Customers
inner join Invoices
on Invoices.CustomerID=Customers.CustomerID
where Customers.CustomerID=10
Invoices table first
use MMABooks
select *
from Invoices
inner join Customers
on Invoices.CustomerID=Customers.CustomerID
where Invoices.CustomerID=10
Questions
Is inner join commutative by design?
Is there a best practice that suggest or prefer one approach over the other one? I mean, which approach should I use?
It would be really weird if they didn't produce the same result. Did you expect a difference?
A best practice is to start with the table from which you select most of the columns.
You do have to worry about the order when you work with LEFT or RIGHT JOINS.

JOIN statement not working with where clause

I'm trying to do
select *
from buildings
where levels = 3
join managers;
but it says error at join. I want to match the the id of the building with the id in the managers table so I think I want natural join.
If you put the JOIN in the right place (between FROM and WHERE) and it were legal to write JOIN without ON then the result would be a cross join - a cartesian product that you then filter in WHERE.
That's a perfectly valid thing to do, though not probably what you intended in this case. It can be achieved by adding comma-separate tables to the FROM clause, eg:
FROM buildings, managers
It's generally better style to write an explicit inner join when you intend to join two tables on a condition:
SELECT *
FROM buildings b INNER JOIN managers m ON (b.manager_id = m.manager_id)
WHERE b.levels = 3;
... because it makes it clear to someone else reading the statement that the ON clause is a join condition, and the bl.levels=3 is a filter. The SQL implementation generally doesn't care, and quite likely transforms the above into:
SELECT *
FROM buildings b, managers m
WHERE b.levels = 3 AND b.manager_id = m.manager_id;
internally anyway, but it's easier (IMO) to understand complex queries with many joins when they're written using explicit join syntax.
There's another way to write what you want, but it's dangerous and IMO shouldn't be used:
SELECT *
FROM buildings bl
NATURAL JOIN managers m
WHERE bl.levels = 3;
That JOINs on any columns that're named the same. It's a nightmare to debug, you have to look up the table structures to understand what it does, it breaks if someone renames a column, and it's just painful. Do not use. See Table expressions in the PostgreSQL manual.
More acceptable is the USING syntax also discussed above:
SELECT *
FROM buildings bl
INNER JOIN managers m USING (manager_id)
WHERE bl.levels = 3;
which matches the columns named manager_id in both columns and JOINs on them, combining them into a single column, but unlike NATURAL JOIN does so explicitly and without scary magic.
I still prefer to write INNER JOIN ... ON (...) but it's reasonable to use USING and, IMO, never reasonable to use NATURAL.
Test table structures were:
create table managers ( manager_id integer primary key );
create table buildings (
manager_id integer references manager(manager_id),
levels integer
);
JOIN operator applied to the tables, you should provide it in the FROM clause. If you do so you will get a new error claiming that there is no JOIN condition, because you have to provide the JOIN condition after the ON clause (it is not optional):
SELECT *
FROM buildings b JOIN managers m ON b.managerid = m.id
WHERE b.levels = 3
if you want to do that, you have to write something like:
select *
from bulidings b
join managers m on b.id=m.id
where levels = 3
The request seems odd however, I believe that one of them should have a foreign key into the other. So a more appropriate query would be:
select *
from buidings b
join managers m on b.id = m.building_id //// or b.manager_id = m.id
where levels = 3
I think which query you are written is wrong.
So please try this
select * from buildings as bl
join managers as mn on bl.F_ManagerID = mn.ManagerID
where bl.levels = 3
I think this will help you...

When to use SQL natural join instead of join .. on?

I'm studying SQL for a database exam and the way I've seen SQL is they way it looks on this page:
http://en.wikipedia.org/wiki/Star_schema
IE join written the way Join <table name> On <table attribute> and then the join condition for the selection. My course book and my exercises given to me from the academic institution however, use only natural join in their examples. So when is it right to use natural join? Should natural join be used if the query can also be written using JOIN .. ON ?
Thanks for any answer or comment
A natural join will find columns with the same name in both tables and add one column in the result for each pair found. The inner join lets you specify the comparison you want to make using any column.
IMO, the JOIN ON syntax is much more readable and maintainable than the natural join syntax. Natural joins is a leftover of some old standards, and I try to avoid it like the plague.
A natural join will find columns with the same name in both tables and add one column in the result for each pair found. The inner join lets you specify the comparison you want to make using any column.
The JOIN keyword is used in an SQL statement to query data from two or more tables, based on a relationship between certain columns in these tables.
Different Joins
* JOIN: Return rows when there is at least one match in both tables
* LEFT JOIN: Return all rows from the left table, even if there are no matches in the right table
* RIGHT JOIN: Return all rows from the right table, even if there are no matches in the left table
* FULL JOIN: Return rows when there is a match in one of the tables
INNER JOIN
http://www.w3schools.com/sql/sql_join_inner.asp
FULL JOIN
http://www.w3schools.com/sql/sql_join_full.asp
A natural join is said to be an abomination because it does not allow qualifying key columns, which makes it confusing. Because you never know which "common" columns are being used to join two tables simply by looking at the sql statement.
A NATURAL JOIN matches on any shared column names between the tables, whereas an INNER JOIN only matches on the given ON condition.
The joins often interchangeable and usually produce the same results. However, there are some important considerations to make:
If a NATURAL JOIN finds no matching columns, it returns the cross
product. This could produce disastrous results if the schema is
modified. On the other hand, an INNER JOIN will return a 'column does
not exist' error. This is much more fault tolerant.
An INNER JOIN self-documents with its ON clause, resulting in a
clearer query that describes the table schema to the reader.
An INNER JOIN results in a maintainable and reusable query in
which the column names can be swapped in and out with changes in the
use case or table schema.
The programmer can notice column name mis-matches (e.g. item_ID vs itemID) sooner if they are forced to define the ON predicate.
Otherwise, a NATURAL JOIN is still a good choice for a quick, ad-hoc query.

Queries that implicit SQL joins can't do?

I've never learned how joins work but just using select and the where clause has been sufficient for all the queries I've done. Are there cases where I can't get the right results using the WHERE clause and I have to use a JOIN? If so, could someone please provide examples? Thanks.
Implicit joins are more than 20 years out-of-date. Why would you even consider writing code with them?
Yes, they can create problems that explicit joins don't have. Speaking about SQL Server, the left and right join implicit syntaxes are not guaranteed to return the correct results. Sometimes, they return a cross join instead of an outer join. This is a bad thing. This was true even back to SQL Server 2000 at least, and they are being phased out, so using them is an all around poor practice.
The other problem with implicit joins is that it is easy to accidentally do a cross join by forgetting one of the where conditions, especially when you are joining too many tables. By using explicit joins, you will get a syntax error if you forget to put in a join condition and a cross join must be explicitly specified as such. Again, this results in queries that return incorrect values or are fixed by using distinct to get rid of the cross join which is inefficient at best.
Moreover, if you have a cross join, the maintenance developer who comes along in a year to make a change doesn't know if it was intended or not when you use implicit joins.
I believe some ORMs also now require explicit joins.
Further, if you are using implied joins because you don't understand how joins operate, chances are high that you are writing code that, in fact, does not return the correct result because you don't know how to evaluate what the correct result would be since you don't understand what a join is meant to do.
If you write SQL code of any flavor, there is no excuse for not thoroughly understanding joins.
Yes. When doing outer joins. You can read this simple article on joins. Joins are not hard to understand at all so you should start learning (and using them where appropriate) right away.
Are there cases where I can't get the right results using the WHERE clause and I have to use a JOIN?
Any time your query involves two or more tables, a join is being used. This link is great for showing the differences in joins with pictures as well as sample result sets.
If the join criteria is in the WHERE clause, then the ANSI-89 JOIN syntax is being used. The reason for the newer JOIN syntax in the ANSI-92 format, is that it made LEFT JOIN more consistent across various databases. For example, Oracle used (+) on the side that was optional while in SQL Server you had to use =*.
Implicit join syntax by default uses Inner joins. It is sometimes possible to modify the implicit join syntax to specify outer joins, but it is vendor dependent in my experience (i know oracle has the (-) and (+) notation, and I believe sqlserver uses *= ). So, I believe your question can be boiled down to understanding the differences between inner and outer joins.
We can look at a simple example for an inner vs outer join using a simple query..........
The implicit INNER join:
select a.*, b.*
from table a, table b
where a.id = b.id;
The above query will bring back ONLY rows where the 'a' row has a matching row in 'b' for it's 'id' field.
The explicit OUTER JOIN:
select * from
table a LEFT OUTER JOIN table b
on a.id = b.id;
The above query will bring back EVERY row in a, whether or not it has a matching row in 'b'. If no match exists for 'b', the 'b' fields will be null.
In this case, if you wanted to bring back EVERY row in 'a' regardless of whether it had a corresponding 'b' row, you would need to use the outer join.
Like I said, depending on your database vendor, you may still be able to use the implicit join syntax and specify an outer join type. However, this ties you to that vendor. Also, any developers not familiar wit that specialized syntax may have difficulty understanding your query.
Any time you want to combine the results of two tables you'll need to join them. Take for example:
Users table:
ID
FirstName
LastName
UserName
Password
and Addresses table:
ID
UserID
AddressType (residential, business, shipping, billing, etc)
Line1
Line2
City
State
Zip
where a single user could have his home AND his business address listed (or a shipping AND a billing address), or no address at all. Using a simple WHERE clause won't fetch a user with no addresses because the addresses are in a different table. In order to fetch a user's addresses now, you'll need to do a join as:
SELECT *
FROM Users
LEFT OUTER JOIN Addresses
ON Users.ID = Addresses.UserID
WHERE Users.UserName = "foo"
See http://www.w3schools.com/Sql/sql_join.asp for a little more in depth definition of the different joins and how they work.
Using Joins :
SELECT a.MainID, b.SubValue AS SubValue1, b.SubDesc AS SubDesc1, c.SubValue AS SubValue2, c.SubDesc AS SubDesc2
FROM MainTable AS a
LEFT JOIN SubValues AS b ON a.MainID = b.MainID AND b.SubTypeID = 1
LEFT JOIN SubValues AS c ON a.MainID = c.MainID AND b.SubTypeID = 2
Off-hand, I can't see a way of getting the same results as that by using a simple WHERE clause to join the tables.
Also, the syntax commonly used in WHERE clauses to do left and right joins (*= and =*) is being phased out,
Oracle supports LEFT JOIN and RIGHT JOIN using their special join operator (+) (and SQL Server used to support *= and =* on join predicates, but no longer does). But a simple FULL JOIN can't be done with implicit joins alone:
SELECT f.title, a.first_name, a.last_name
FROM film f
FULL JOIN film_actor fa ON f.film_id = fa.film_id
FULL JOIN actor a ON fa.actor_id = a.actor_id
This produces all films and their actors including all the films without actor, as well as the actors without films. To emulate this with implicit joins only, you'd need unions.
-- Inner join part
SELECT f.title, a.first_name, a.last_name
FROM film f, film_actor fa, actor a
WHERE f.film_id = fa.film_id
AND fa.actor_id = a.actor_id
-- Left join part
UNION ALL
SELECT f.title, null, null
FROM film f
WHERE NOT EXISTS (
SELECT 1
FROM film_actor fa
WHERE fa.film_id = f.film_id
)
-- Right join part
UNION ALL
SELECT null, a.first_name, a.last_name
FROM actor a
WHERE NOT EXISTS (
SELECT 1
FROM film_actor fa
WHERE fa.actor_id = a.actor_id
)
This will quickly become very inefficient both syntactically as well as from a performance perspective.