What's "wrong" with my DQL query?

I have the following SQL query:
select bank.*
from bank
join branch on branch.bank_id = bank.id
join account a on a.branch_id = branch.id
join import i on a.import_id = i.id
It returns exactly what I expect.
Now consider the following two DQL queries:
$q = Doctrine_Query::create()
->select('Bank.*')
->from('Bank')
->leftJoin('Branch')
->leftJoin('Account')
->leftJoin('Import');
and
$q = Doctrine_Query::create()
->select('Bank.*')
->from('Bank')
->innerJoin('Branch')
->innerJoin('Account')
->innerJoin('Import');
It would have been nice to be able to use a "join()" method, but the official Doctrine join documentation says, "DQL supports two kinds of joins INNER JOINs and LEFT JOINs." For some reason that thoroughly escapes me, they chose not to support natural joins. Anyway, what that means is that the two queries above are my only options. Well, that's unfortunate, because neither of them works.
The first query - the one with left joins - doesn't work because, of course, a left join and a natural join are two different things.
The second query doesn't work, either. Bafflingly, I get an error: "Unknown relation alias."
Why should Doctrine be able to figure out the alias for a LEFT JOIN but not for an INNER JOIN?
By the way, I realize INNER JOIN and JOIN are only nominally different but why implement the more specific one and not the canonical, natural one?

Give the root component an alias and join through its relation, for a left join:
->select('b.*')
->from('Bank b')
->leftJoin('b.Branch h')
or for an inner join:
->select('b.*')
->from('Bank b')
->innerJoin('b.Branch h')
See http://www.doctrine-project.org/documentation/manual/1_1/en/dql-doctrine-query-language:join-syntax
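On the side question about JOIN versus INNER JOIN, here is a minimal sketch in plain SQL, run through Python's sqlite3 module rather than Doctrine. The two-table schema and its rows are made up for illustration and only loosely follow the bank/branch tables from the question. It suggests that JOIN and INNER JOIN return the same rows, while LEFT JOIN also keeps banks that have no branch:

```python
import sqlite3

# Hypothetical schema and data, loosely modelled on the question's tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bank (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE branch (id INTEGER PRIMARY KEY, bank_id INTEGER, city TEXT);
    INSERT INTO bank VALUES (1, 'First'), (2, 'Second');
    INSERT INTO branch VALUES (10, 1, 'Oslo');  -- bank 2 has no branch
""")

def bank_names(join_keyword):
    """Run the same query with a different join keyword and return bank names."""
    sql = f"""
        SELECT bank.name
        FROM bank
        {join_keyword} branch ON branch.bank_id = bank.id
        ORDER BY bank.id
    """
    return [row[0] for row in conn.execute(sql)]

plain_join = bank_names("JOIN")        # bare JOIN
inner_join = bank_names("INNER JOIN")  # explicit INNER JOIN
left_join = bank_names("LEFT JOIN")    # keeps banks without a branch

print(plain_join, inner_join, left_join)
```

If a bare join() existed in DQL, it would presumably map to the inner join behaviour shown here, but that is an assumption about Doctrine, not something the sketch verifies.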

Related

What is this type of join called where the joins are qualified at the end?

I'm working with some views in an old HR system that frequently puts a number of joins together with the 'ON' part of the joins not coming until the end. Why would someone do it in this way and is there some advantage to it? I find it quite confusing when there is a large number of joins in it. I couldn't quite describe the situation in a web search well enough to help me.
SELECT
ExportTypeIdNo = et.ExportTypeIdNo,
ExportDataStoredProcedureIdNo = et.ExportDataStoredProcedureIdNo,
ExportDataStoredProcedure = sp1.Name,
ExportFileStoredProcedureIdNo = et.ExportFileStoredProcedureIdNo,
ExportFileStoredProcedure = sp2.Name
FROM tSTORED_PROCEDURES sp2 INNER JOIN
(tSTORED_PROCEDURES sp1 INNER JOIN
(tEXPORT_TYPES et INNER JOIN tTYPE_CODES tc
ON et.ReportingTreeIdNo = tc.TypeCodeIdNo)
ON sp1.StoredProcedureIdNo = et.ExportDataStoredProcedureIdNo)
ON sp2.StoredProcedureIdNo = et.ExportFileStoredProcedureIdNo
This code is just an example of what this type of join looks like.
There is no good reason to do this. With inner join, any ordering of the joins is equivalent. Almost everyone would agree that the following is simpler to follow and maintain:
FROM tEXPORT_TYPES et INNER JOIN
tTYPE_CODES tc
ON et.ReportingTreeIdNo = tc.TypeCodeIdNo INNER JOIN
tSTORED_PROCEDURES sp1
ON sp1.StoredProcedureIdNo = et.ExportDataStoredProcedureIdNo INNER JOIN
tSTORED_PROCEDURES sp2
ON sp2.StoredProcedureIdNo = et.ExportFileStoredProcedureIdNo
There are some arcane situations where rearranging outer joins is not exactly equivalent. In general, though, inner joins followed by left joins is sufficient for almost all the queries that I write.
As to why someone would write the joins this way, I can only speculate. My best guess is that they were used to listing tables with commas in the FROM clause and simply grew to prefer having all the tables referenced before the conditions connecting them.
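To check the claim that the order of inner joins does not change the result, here is a small sketch using Python's sqlite3 module. The table and column names are shortened, hypothetical stand-ins for the tEXPORT_TYPES, tTYPE_CODES and tSTORED_PROCEDURES tables, and the rows are invented:

```python
import sqlite3

# Hypothetical, simplified stand-ins for the tables in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE type_codes (id INTEGER PRIMARY KEY);
    CREATE TABLE procs (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE export_types (
        id INTEGER PRIMARY KEY,
        tree_id INTEGER,
        data_proc_id INTEGER,
        file_proc_id INTEGER
    );
    INSERT INTO type_codes VALUES (1), (2);
    INSERT INTO procs VALUES (100, 'load_data'), (200, 'write_file');
    -- export type 3 points at a type code that does not exist
    INSERT INTO export_types VALUES (1, 1, 100, 200), (2, 2, 100, 200), (3, 9, 100, 200);
""")

# Ordering 1: start from export_types, as in the rewritten query.
export_types_first = conn.execute("""
    SELECT et.id, sp1.name, sp2.name
    FROM export_types et
    INNER JOIN type_codes tc ON et.tree_id = tc.id
    INNER JOIN procs sp1 ON sp1.id = et.data_proc_id
    INNER JOIN procs sp2 ON sp2.id = et.file_proc_id
    ORDER BY et.id
""").fetchall()

# Ordering 2: start from the second procedure alias instead.
procs_first = conn.execute("""
    SELECT et.id, sp1.name, sp2.name
    FROM procs sp2
    INNER JOIN export_types et ON sp2.id = et.file_proc_id
    INNER JOIN procs sp1 ON sp1.id = et.data_proc_id
    INNER JOIN type_codes tc ON et.tree_id = tc.id
    ORDER BY et.id
""").fetchall()

print(export_types_first == procs_first)
```

Both orderings drop export type 3, because it has no matching type code, and otherwise return the same rows.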

Explanation of SQL ON

I have googled for an explanation of the SQL function ON, but
I couldn't find a good explanation of how it works.
Is it associated with INNER JOIN?
Could someone please explain what really happens in my code snippet?
(See my code below.)
SELECT
TS_TEST_ID as Test_ID,
TS_NAME as Name
FROM TEST
INNER JOIN DESSTEPS
ON TEST.TS_TEST_ID = DESSTEPS.DS_TEST_ID
INNER JOIN ALL_LISTS
ON ALL_LISTS.AL_ITEM_ID = TEST.TS_SUBJECT
It is not a function; it is part of the language. Natural language has various types of words, such as nouns and verbs, and ON is like a preposition.
ON is part of the syntax for INNER JOIN. It goes like this:
one table INNER JOIN some other table ON how I want to join both tables (key columns)
You might find some more details here
ON tells the join which condition to use to connect the tables.
In this case:
FROM TEST
INNER JOIN DESSTEPS
ON TEST.TS_TEST_ID = DESSTEPS.DS_TEST_ID
Your tables TEST and DESSTEPS will be joined on the columns TS_TEST_ID and DS_TEST_ID, so records belong together where these IDs are equal.
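To see what the ON condition does to actual rows, here is a minimal sketch using Python's sqlite3 module. Only TS_TEST_ID, TS_NAME and DS_TEST_ID come from the question; the DS_ID and DS_STEP columns and all the data are assumptions made up for the example:

```python
import sqlite3

# Hypothetical data; DS_ID and DS_STEP are invented columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TEST (TS_TEST_ID INTEGER PRIMARY KEY, TS_NAME TEXT);
    CREATE TABLE DESSTEPS (DS_ID INTEGER PRIMARY KEY, DS_TEST_ID INTEGER, DS_STEP TEXT);
    INSERT INTO TEST VALUES (1, 'login'), (2, 'logout');
    INSERT INTO DESSTEPS VALUES
        (1, 1, 'open page'),
        (2, 1, 'enter password'),
        (3, 2, 'click logout'),
        (4, 99, 'orphan step');  -- no TEST row has id 99
""")

# Each TEST row is paired with every DESSTEPS row for which the ON condition is true.
rows = conn.execute("""
    SELECT TEST.TS_NAME, DESSTEPS.DS_STEP
    FROM TEST
    INNER JOIN DESSTEPS
        ON TEST.TS_TEST_ID = DESSTEPS.DS_TEST_ID
    ORDER BY DESSTEPS.DS_ID
""").fetchall()

for name, step in rows:
    print(name, "->", step)
```

The 'orphan step' row never appears, because no TEST row satisfies the ON condition for it.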

outer joins models that are not associations

I have the following SQL I want to create with activerecord. My problem is that I am stuck in a logic loop where I can't LEFT OUTER JOIN a table which has yet to be joined, and I can't find my entry point to the join fiasco
in activerecord I am trying to do
AdMsgs.joins("LEFT OUTER JOIN shows ON ad_msgs.user_id = shows.id OR ad_msgs.user_id = shows.b_id ")
.joins("LEFT OUTER JOIN m ON m.user_id = users.id OR m.m_id = shops.id OR m.m_id = shows.b_id")
.joins("LEFT OUTER JOIN users ON ad_msgs.to = users.email OR ad_msgs.user_id = users.id OR users.id = m.user_id")
.where("shows.id = ?", self.id)
.distinct("ad_msgs.id")
The query raises an error saying it doesn't know what users is in the second join (probably because I haven't joined it yet), but I need to select the m records according to the users.
AdMsgs doesn't have an association with either of the tables.
Is there a way to full outer join these 3 tables and then select the relevant ones (or is there a better way)?
Use find_by_sql to implement such scenarios.
Otherwise, as a rule of thumb, if you can't use joins like Blog.joins(articles: :comments), you are probably doing something wrong, or you should use find_by_sql instead.
In a complex case, I write my query in SQL first to verify the logic involved. Oftentimes it's trivial to replace one complex query with two simple ones (using IN(*ids)).
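The "two simple queries with IN" idea can be sketched as follows. This uses Python's sqlite3 module rather than ActiveRecord, and the cut-down shows and ad_msgs tables, their columns and their rows are assumptions for illustration only:

```python
import sqlite3

# Hypothetical, cut-down versions of the shows and ad_msgs tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shows (id INTEGER PRIMARY KEY, b_id INTEGER);
    CREATE TABLE ad_msgs (id INTEGER PRIMARY KEY, user_id INTEGER, body TEXT);
    INSERT INTO shows VALUES (1, 7), (2, 8);
    INSERT INTO ad_msgs VALUES (1, 1, 'a'), (2, 7, 'b'), (3, 8, 'c'), (4, 5, 'd');
""")

show_id = 1

# Query 1: collect the ids that a message's user_id may match for this show.
row = conn.execute("SELECT id, b_id FROM shows WHERE id = ?", (show_id,)).fetchone()
ids = [value for value in row if value is not None] if row else []

# Query 2: fetch the messages with a plain IN list, one placeholder per id.
placeholders = ", ".join("?" for _ in ids)
messages = conn.execute(
    f"SELECT id, body FROM ad_msgs WHERE user_id IN ({placeholders}) ORDER BY id",
    ids,
).fetchall()

print(messages)
```

Each query stays easy to read and test on its own, at the cost of one extra round trip to the database.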

SQL INNER JOIN implemented as implicit JOIN

Recently, I came across an SQL query which looked like this:
SELECT * FROM A, B WHERE A.NUM = B.NUM
To me, it seems as if this will return exactly the same as an INNER JOIN:
SELECT * FROM A INNER JOIN B ON A.NUM = B.NUM
Is there any sane reason why anyone would use a CROSS JOIN here? Edit: it seems as if most SQL applications will automatically use an INNER JOIN here.
The database is HSQLDB
The older syntax is a SQL antipattern. It should be replaced with an inner join anytime you see it. Part of why it is an antipattern is that it is impossible to tell whether a cross join was intended if the WHERE clause is omitted. This causes many accidental cross joins, especially in complex queries. Further, in some databases (especially SQL Server) the implicit outer joins do not work correctly, so people try to combine explicit and implicit joins and get bad results without even realizing it. All in all, it is a poor practice to even consider using an implicit join.
Yes, both of your statements will return the same result. Which one is to be used is a matter of taste. Every sane database system will use a join for both if possible; no sane optimizer will really use a cross product in the first case.
But note that your first syntax is not a cross join. It is just an implicit notation for a join which does not specify which kind of join to use. Instead, the optimizer must check the WHERE clauses to determine whether to use an inner join or a cross join: if an applicable join condition is found in the WHERE clause, this will result in an inner join. If no such clause is found, it will result in a cross join. Since your first example specifies an applicable join condition (WHERE A.NUM = B.NUM), this results in an INNER JOIN and is thus exactly equivalent to your second case.
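A quick way to convince yourself is to run both forms against the same data. The sketch below uses Python's sqlite3 module instead of HSQLDB, with made-up rows for A and B, so it only shows the result sets, not what any particular optimizer does:

```python
import sqlite3

# Hypothetical data for the A and B tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (NUM INTEGER);
    CREATE TABLE B (NUM INTEGER);
    INSERT INTO A VALUES (1), (2), (3);
    INSERT INTO B VALUES (2), (3), (4);
""")

# Implicit join: comma-separated tables plus a condition in WHERE.
implicit = conn.execute(
    "SELECT A.NUM, B.NUM FROM A, B WHERE A.NUM = B.NUM ORDER BY A.NUM"
).fetchall()

# Explicit join: the same condition moved into ON.
explicit = conn.execute(
    "SELECT A.NUM, B.NUM FROM A INNER JOIN B ON A.NUM = B.NUM ORDER BY A.NUM"
).fetchall()

# Forgetting the WHERE condition silently yields the full cross product.
no_condition = conn.execute("SELECT A.NUM, B.NUM FROM A, B").fetchall()

print(implicit, explicit, len(no_condition))
```

The last query is the accidental cross join the first answer warns about: it returns every combination of the three A rows and three B rows.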

Queries that implicit SQL joins can't do?

I've never learned how joins work but just using select and the where clause has been sufficient for all the queries I've done. Are there cases where I can't get the right results using the WHERE clause and I have to use a JOIN? If so, could someone please provide examples? Thanks.
Implicit joins are more than 20 years out-of-date. Why would you even consider writing code with them?
Yes, they can create problems that explicit joins don't have. Speaking about SQL Server, the left and right join implicit syntaxes are not guaranteed to return the correct results. Sometimes, they return a cross join instead of an outer join. This is a bad thing. This was true even back to SQL Server 2000 at least, and they are being phased out, so using them is an all around poor practice.
The other problem with implicit joins is that it is easy to accidentally do a cross join by forgetting one of the where conditions, especially when you are joining many tables. By using explicit joins, you will get a syntax error if you forget to put in a join condition, and a cross join must be explicitly specified as such. Again, this results in queries that return incorrect values or are fixed by using distinct to get rid of the cross join, which is inefficient at best.
Moreover, if you have a cross join, the maintenance developer who comes along in a year to make a change doesn't know if it was intended or not when you use implicit joins.
I believe some ORMs also now require explicit joins.
Further, if you are using implied joins because you don't understand how joins operate, chances are high that you are writing code that, in fact, does not return the correct result because you don't know how to evaluate what the correct result would be since you don't understand what a join is meant to do.
If you write SQL code of any flavor, there is no excuse for not thoroughly understanding joins.
Yes. When doing outer joins. You can read this simple article on joins. Joins are not hard to understand at all so you should start learning (and using them where appropriate) right away.
Are there cases where I can't get the right results using the WHERE clause and I have to use a JOIN?
Any time your query involves two or more tables, a join is being used. This link is great for showing the differences in joins with pictures as well as sample result sets.
If the join criteria are in the WHERE clause, then the ANSI-89 JOIN syntax is being used. The reason for the newer JOIN syntax in the ANSI-92 format is that it made LEFT JOIN more consistent across various databases. For example, Oracle used (+) on the side that was optional, while in SQL Server you had to use *= (or =* for a right join).
Implicit join syntax uses inner joins by default. It is sometimes possible to modify the implicit join syntax to specify outer joins, but it is vendor dependent in my experience (I know Oracle has the (+) notation, and I believe SQL Server uses *=). So, I believe your question can be boiled down to understanding the differences between inner and outer joins.
We can look at a simple example of an inner vs. outer join using a simple query.
The implicit INNER join:
select a.*, b.*
from table a, table b
where a.id = b.id;
The above query will bring back ONLY rows where the 'a' row has a matching row in 'b' for its 'id' field.
The explicit OUTER JOIN:
select * from
table a LEFT OUTER JOIN table b
on a.id = b.id;
The above query will bring back EVERY row in a, whether or not it has a matching row in 'b'. If no match exists for 'b', the 'b' fields will be null.
In this case, if you wanted to bring back EVERY row in 'a' regardless of whether it had a corresponding 'b' row, you would need to use the outer join.
Like I said, depending on your database vendor, you may still be able to use the implicit join syntax and specify an outer join type. However, this ties you to that vendor. Also, any developers not familiar with that specialized syntax may have difficulty understanding your query.
Any time you want to combine the results of two tables you'll need to join them. Take for example:
Users table:
ID
FirstName
LastName
UserName
Password
and Addresses table:
ID
UserID
AddressType (residential, business, shipping, billing, etc)
Line1
Line2
City
State
Zip
where a single user could have his home AND his business address listed (or a shipping AND a billing address), or no address at all. Using a simple WHERE clause won't fetch a user with no addresses because the addresses are in a different table. In order to fetch a user's addresses now, you'll need to do a join as:
SELECT *
FROM Users
LEFT OUTER JOIN Addresses
ON Users.ID = Addresses.UserID
WHERE Users.UserName = 'foo'
See http://www.w3schools.com/Sql/sql_join.asp for a little more in depth definition of the different joins and how they work.
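Here is a small sketch of the Users/Addresses case, run through Python's sqlite3 module. The tables are trimmed to a few columns and the rows are invented; the user 'foo' deliberately has no address:

```python
import sqlite3

# Hypothetical, trimmed-down Users and Addresses tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (ID INTEGER PRIMARY KEY, UserName TEXT);
    CREATE TABLE Addresses (ID INTEGER PRIMARY KEY, UserID INTEGER, City TEXT);
    INSERT INTO Users VALUES (1, 'foo'), (2, 'bar');
    INSERT INTO Addresses VALUES (1, 2, 'Springfield');  -- 'foo' has no address
""")

# Implicit (inner) join: a user without addresses produces no row at all.
implicit_rows = conn.execute("""
    SELECT Users.UserName, Addresses.City
    FROM Users, Addresses
    WHERE Users.ID = Addresses.UserID
      AND Users.UserName = ?
""", ("foo",)).fetchall()

# Left outer join: the user is kept and the address columns come back as NULL.
outer_rows = conn.execute("""
    SELECT Users.UserName, Addresses.City
    FROM Users
    LEFT OUTER JOIN Addresses
        ON Users.ID = Addresses.UserID
    WHERE Users.UserName = ?
""", ("foo",)).fetchall()

print(implicit_rows, outer_rows)
```

The user name is passed as a bound parameter here rather than written into the SQL string, which is a choice made for the sketch and not part of the original answer.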
Using Joins :
SELECT a.MainID, b.SubValue AS SubValue1, b.SubDesc AS SubDesc1, c.SubValue AS SubValue2, c.SubDesc AS SubDesc2
FROM MainTable AS a
LEFT JOIN SubValues AS b ON a.MainID = b.MainID AND b.SubTypeID = 1
LEFT JOIN SubValues AS c ON a.MainID = c.MainID AND c.SubTypeID = 2
Off-hand, I can't see a way of getting the same results as that by using a simple WHERE clause to join the tables.
Also, the syntax commonly used in WHERE clauses to do left and right joins (*= and =*) is being phased out.
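Why the type filter has to sit in the ON clause rather than in WHERE can be seen in a small sketch using Python's sqlite3 module. The data is invented, and each alias filters on its own SubTypeID (b on type 1, c on type 2):

```python
import sqlite3

# Hypothetical data: main row 2 has a type-1 value but no type-2 value.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MainTable (MainID INTEGER PRIMARY KEY);
    CREATE TABLE SubValues (MainID INTEGER, SubTypeID INTEGER, SubValue TEXT);
    INSERT INTO MainTable VALUES (1), (2);
    INSERT INTO SubValues VALUES (1, 1, 'x'), (1, 2, 'y'), (2, 1, 'z');
""")

# Filter inside ON: every main row survives, missing values become NULL.
filter_in_on = conn.execute("""
    SELECT a.MainID, b.SubValue, c.SubValue
    FROM MainTable AS a
    LEFT JOIN SubValues AS b ON a.MainID = b.MainID AND b.SubTypeID = 1
    LEFT JOIN SubValues AS c ON a.MainID = c.MainID AND c.SubTypeID = 2
    ORDER BY a.MainID
""").fetchall()

# Filter moved to WHERE: rows whose c side has no type-2 value are thrown away.
filter_in_where = conn.execute("""
    SELECT a.MainID, b.SubValue, c.SubValue
    FROM MainTable AS a
    LEFT JOIN SubValues AS b ON a.MainID = b.MainID
    LEFT JOIN SubValues AS c ON a.MainID = c.MainID
    WHERE b.SubTypeID = 1 AND c.SubTypeID = 2
    ORDER BY a.MainID
""").fetchall()

print(filter_in_on)
print(filter_in_where)
```

With the filter in WHERE, the query behaves like an inner join for main row 2, which is the behaviour a plain WHERE-based join would give as well.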
Oracle supports LEFT JOIN and RIGHT JOIN using their special join operator (+) (and SQL Server used to support *= and =* on join predicates, but no longer does). But a simple FULL JOIN can't be done with implicit joins alone:
SELECT f.title, a.first_name, a.last_name
FROM film f
FULL JOIN film_actor fa ON f.film_id = fa.film_id
FULL JOIN actor a ON fa.actor_id = a.actor_id
This produces all films and their actors including all the films without actor, as well as the actors without films. To emulate this with implicit joins only, you'd need unions.
-- Inner join part
SELECT f.title, a.first_name, a.last_name
FROM film f, film_actor fa, actor a
WHERE f.film_id = fa.film_id
AND fa.actor_id = a.actor_id
-- Left join part
UNION ALL
SELECT f.title, null, null
FROM film f
WHERE NOT EXISTS (
SELECT 1
FROM film_actor fa
WHERE fa.film_id = f.film_id
)
-- Right join part
UNION ALL
SELECT null, a.first_name, a.last_name
FROM actor a
WHERE NOT EXISTS (
SELECT 1
FROM film_actor fa
WHERE fa.actor_id = a.actor_id
)
This will quickly become unwieldy to write and inefficient to run.
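For anyone who wants to try the union-based emulation, here is a sketch that runs it through Python's sqlite3 module. The film, film_actor and actor tables are reduced to the columns the query needs, and the three rows are invented so that each branch of the union contributes exactly one row:

```python
import sqlite3

# Hypothetical minimal tables: one matched pair, one film without actors,
# and one actor without films.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE film (film_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE actor (actor_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE film_actor (film_id INTEGER, actor_id INTEGER);
    INSERT INTO film VALUES (1, 'Film A'), (2, 'Film B');
    INSERT INTO actor VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
    INSERT INTO film_actor VALUES (1, 1);
""")

rows = conn.execute("""
    -- Inner join part
    SELECT f.title, a.first_name, a.last_name
    FROM film f, film_actor fa, actor a
    WHERE f.film_id = fa.film_id
      AND fa.actor_id = a.actor_id
    UNION ALL
    -- Left join part: films without any actor
    SELECT f.title, NULL, NULL
    FROM film f
    WHERE NOT EXISTS (
        SELECT 1 FROM film_actor fa WHERE fa.film_id = f.film_id
    )
    UNION ALL
    -- Right join part: actors without any film
    SELECT NULL, a.first_name, a.last_name
    FROM actor a
    WHERE NOT EXISTS (
        SELECT 1 FROM film_actor fa WHERE fa.actor_id = a.actor_id
    )
""").fetchall()

for row in rows:
    print(row)
```

The row order of a UNION ALL without ORDER BY is not guaranteed, so the output is best compared as a set.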