Same ID field on multiple tables - sql

So I've been given a task where i should present the following:
ID, LNAME, FNAME, MNAME, BIRTH_DATE, RELG_CODE, NAT_CODE, PT_STATUS, RM_NO, DTTM_ADM
The tables are:
HISR_CODES, PASR_NAMES, PASR_PROFILE, PAST_PATIENT_ADM
--Viewing them using DESC--
So while I viewed them, I was told that the ID on these tables are the same. So what I did so far in the coding (I'll finish the rest but I need to make sure this works first):
SELECT
A.ID,
A.LNAME,
A.FNAME,
A.MNAME,
A.BIRTH_DATE,
C.RELG_CODE,
C.NAT_CODE,
B.PT_STATUS,
B.RM_NO,
B.DTTM_ADM
FROM
PASR_NAMES A,
PASR_PROFILE B,
PAST_PATIENT_ADM C,
HISR_CODES D
WHERE
A.ID = B.ID
AND
B.ID = C.ID
AND
C.ID = D.ID
Is there a way to tell that all of the ID's from the tables are the same? A simpler code than going on like this:
WHERE
A.ID = B.ID
AND
B.ID = C.ID
AND
C.ID = D.ID
Or is JOIN - ON the only option for this?

You can use NATURAL JOIN as below:
SELECT
A.ID,
A.LNAME,
A.FNAME,
A.MNAME,
A.BIRTH_DATE,
C.RELG_CODE,
C.NAT_CODE,
B.PT_STATUS,
B.RM_NO,
B.DTTM_ADM
FROM
PASR_NAMES A
NATURAL JOIN PASR_PROFILE B
NATURAL JOIN PAST_PATIENT_ADM C
NATURAL JOIN HISR_CODES D;
From Oracle Reference, "A natural join is based on all columns in the two tables that have the same name." So, there is a chance that the joins happen based on other columns as well. Therefore, it is recommended that you still use the INNER JOIN syntax and explicitly specify the JOIN columns.
References:
NATURAL JOIN on Oracle® Database SQL Language Reference
Related SO question

Use the proper join syntax:
FROM PASR_NAMES A JOIN
PASR_PROFILE B
ON A.ID = B.ID JOIN
PAST_PATIENT_ADM C
ON B.ID = C.ID JOIn
HISR_CODES D
ON C.ID = D.ID
Or:
FROM PASR_NAMES A JOIN
PASR_PROFILE B
USING (ID) JOIN
PAST_PATIENT_ADM C
USING (ID)
HISR_CODES D
USING (ID)
I would discourage you from using the natural join. It might seem like the right thing at first. However, the semantics of the query are highly dependent on the structure of the tables. If columns are renamed, removed, or added, the query might still work but produce highly unexpected results.

Related

Strange / esoteric join syntax

I've been provided this old SQL code (table names changed) to replicate, and the JOIN syntax isn't something I've seen before, and is proving hard to google:
select <stuff>
from A
inner join B
on A.ID = B.A_ID
inner join C -- eh? No ON?
inner join D
ON C.C_ID = D.C_ID
ON B.C_ID = D.C_ID -- a second ON here? what?
When I saw the code, I assumed I'd be sent broken code and it wouldn't run.
But it does. (Sql Server 2012)
What does it do? Is there a more sensible / standard way of writing it? What's happening here?
While unusual, this is perfectly valid tsql. Typically you see this approach when you have an outer join to a set of related tables which are inner joined to one another. A better IMHO way to write this would be:
inner join B
on A.ID = B.A_ID
inner join (C inner join D ON C.C_ID = D.C_ID)
ON B.C_ID = D.C_ID
This makes the join logic clear - and it also helps the reader. Additionally, it lets the reader know that the developer did this intentionally. So let this be an example of poor coding. Comment things that are unusual. Explain things. Have someone review your code periodically to help improve your style and usage.
And you could write this in a "typical" style by rearranging the order of tables in the from clause - but I'll guess that the current version makes more logical sense with the real table names.
I ran it by a colleague who figured it out:
select <stuff>
from A
inner join B
on A.ID = B.A_ID
inner join ( C -- put a bracket here...
inner join D
ON C.C_ID = D.C_ID
) -- and one here
ON B.C_ID = D.C_ID
or to format it a little nicer:
select <stuff>
from A
inner join B
on A.ID = B.A_ID
inner join (
C
inner join D
ON C.C_ID = D.C_ID
)
ON B.C_ID = D.C_ID
I wasn't familiar with this kind of "sub-join" (I don't know what it's called), but this is much more readable and clear

SQL - Consecutive "ON" Statements

As I was cleaning up some issues in an old view in our database I came across this "strange" join condition:
from
tblEmails [e]
join tblPersonEmails [pe]
on (e.EmailID = pe.EmailID)
right outer join tblUserAccounts [ua]
join People [p]
on (ua.PersonID = p.Id)
join tblChainEmployees [ce]
on (ua.PersonID = ce.PersonID)
on (pe.PersonID = p.Id)
Table tblUserAccounts is referenced as a right outer join, but the on condition for it is not declared until after tblChainEmployees is referenced; then there are two consecutive on statements in a row.
I couldn't find a relevant answer anywhere on the Internet, because I didn't know what this kind of join is called.
So the questions:
Does this kind of "deferred conditional" join have a name?
How can this be rewritten to produce the same result set where the on statements are not consecutive?
Maybe this is a "clever" solution when there has always been a simpler/clearer way?
(1) This is just syntax and I've never heard of some special name. If you read carefully this MSDN article you'll see that (LEFT|RIGHT) JOIN has to be paired with ON statement. If it's not, expression inside is parsed as <table_source>. You can put parentheses to make it more readable:
from
tblEmails [e]
join tblPersonEmails [pe]
on (e.EmailID = pe.EmailID)
right outer join
(
tblUserAccounts [ua]
join People [p]
on (ua.PersonID = p.Id)
join tblChainEmployees [ce]
on (ua.PersonID = ce.PersonID)
) on (pe.PersonID = p.Id)
(2) I would prefer LEFT syntax, with explicit parentheses (I know, it's a matter of taste). This produces the same execution plan:
FROM tblUserAccounts ua
JOIN People p ON ua.PersonID = p.Id
JOIN tblChainEmployees ce ON ua.PersonID = ce.PersonID
LEFT JOIN
(
tblEmails e
JOIN tblPersonEmails pe ON e.EmailID = pe.EmailID
) ON pe.PersonID = p.Id
(3) Yes, it's clever, just like some C++ expressions (i.e. (i++)*(*t)[0]<<p->a) on interviews. Language is flexible. Expressions and queries can be tricky, but some 'arrangements' lead to readability degradation and errors.
Looks to me like you have tblEmail and tblPerson with their own independent IDs, emailID and ID (person), a linking table tblPersonEmail with the valid pairs of emailID/IDs, and then the person table may have a 1-1 relationship with UserAccount, which may then have a 1-1 relationship with chainEmployee, so to get rid of the RIGHT OUTER JOIN in favor of LEFT, I'd use:
FROM
((tblPerson AS p INNER JOIN
(tblEmail AS e INNER JOIN
tblPersonEmail AS pe ON
e.emailID = pe.emailID) ON
p.ID = pe.personID) LEFT JOIN
tblUserAccount AS ua ON
p.ID = ua.personID) LEFT JOIN
tblChainEmployee AS ce ON
ua.personID = ce.personID
I can't think of a great practical example of this off the top of my head so I'll give you a generic example that hopefully makes sense. Unfortunately I'm not aware of a generic name for this either.
Many people will start off with a query like this:
select ...
from
A a left outer join
B b on b.id = a.id left outer join
C c on c.id2 = b.id2;
The look at the results and realize that they really need to eliminate the rows in B that don't have a corresponding C but if you tried to say where b.id2 is not null and c.id2 is not null you've defeated the whole purpose of the left join from A.
So next you try to do this but it doesn't take long to figure out it's not going to work. The inner join at the tail end of the chain has basically converted both the joins to inner joins.
select ...
from
A a left outer join
B b on b.id = a.id inner join
C c on c.id2 = b.id2;
The problem seems simple yet it doesn't work right. Essentially after you ponder for a while you discover that you need to control the join order and do the inner join first. So the three queries below are equivalent ways to accomplish that. The first one is probably the one you're more familiar with:
select ...
from
A a left outer join
(select * from B b inner join C c on c.id2 = b.id2) bc
on bc.id = a.id
select ...
from
A a left outer join
B b inner join
C c on c.id2 = b.id2
on b.id = a.id
select ...
from
B b inner join
C c on c.id2 = b.id2 right outer join -- now they can be done in order
A a on a.id = b.id
You query is a little more complicated but ultimately the same issues came into play which is where the odd stuff came from. SQL has evolved and you have to remember that platforms didn't always have the fancy things like derived tables, scalar subqueries, CTEs so sometimes people had to write things this way. And then there were graphical query builders with a lot of limitations in older versions of tools like Crystal Report that didn't allow for complex join conditions...

SQL Server query issue - ambiguous column

I have four tables :
Applicant (aid, aname)
entrance_test (Etid, etname)
etest_centre (etcid, location)
etest_details (aid, etid, etcid, etest_dt)
I want to select the number of applicants who have appeared for each test, test center wise.
This is my current query:
select
location, etname, count(Aid) as number of applicants
from
applicant as a
inner join
etest_details as d on a.aid = d.aid
inner join
Entrance_Test as t on t.Etid = d.Etid
inner join
Etest_Centre as c on c.Etcid = d.Etcid
group by
Location, Etname
This is the error I am getting :
Ambiguous column name 'Aid'
You have the column aid in multiple tables, and it doesn't know which to pick from. You should specify which table it is from using the aliases you defined.
In this case, since a.Aid is the same as d.Aid (due to the JOIN), I'm using the a alias, but do keep in mind if location and etname also appear in multiple tables, you need to specify which table it should pick from.
Select c.location, t.etname, Count(a.Aid)
From Applicant As a
Inner Join etest_details As d On a.aid = d.aid
Inner Join Entrance_Test As t On t.Etid = d.Etid
Inner Join Etest_Centre As c On c.Etcid = d.Etcid
Group By c.Location, t.Etname
As a rule of thumb, when you have multiple sources in one query, you should always be explicit about which table it should come from. Even if you're sure it only exists in one of them, it's a good habit to get into to avoid issues like this in the future.
You need to mention the alias in the COUNT clause. Since you are using aliases, it would be better if you use them in the SELECT and GROUP BY sections as well. In this case, it should be :
SELECT a.location,
a.etname,
COUNT(d.Aid)
FROM applicant AS a
INNER JOIN etest_details AS d ON a.aid = d.aid
INNER JOIN Entrance_Test AS t ON t.Etid = d.Etid
INNER JOIN Etest_Centre AS c ON c.Etcid = d.Etcid
GROUP BY a.Location,
a.Etname

Equivalent SQL Query using Cartesian Product Only

given 3 tables a : {id, name_eng}, b: {id, name_spa} and c: {id, name_ita}
which is the equivalent "product cartesian query" for this given one:
select
a.name_eng
b.name_spa
c.name_ita
from
a inner join b on a.id = b.id
left outer join c on a.id = c.id
I don't know what you want, a cartesian product would be this:
SELECT a.name_eng
b.name_spa
c.name_ita
FROM a
CROSS JOIN b
CROSS JOIN c
Or the implicit way:
SELECT a.name_eng
b.name_spa
c.name_ita
FROM a,b,c
If you want your previous query written with cartesian products (why??), then this should do (on SQL Server 2000):
SELECT a.name_eng
b.name_spa
c.name_ita
FROM a,b,c
WHERE a.id = b.id
AND a.id *= c.id
If I wasn't clear enough with the "why??", you shouldn't use implicit joins since hey are deprecated, you should always use proper explicit joins.

SQL SELECT and SUM from three

I have been cracking my head for hours on what I thought to be simple SQL SELECT command. I searched every where and read all questions related to mine. I tried an SQL Command Builder, and even read and applied complete series of SQL tutorials and manuals to try to build it from scratch understanding it (which is very important for me, regarding next commands I'll eventually have to build...).
But now I'm just stuck with the results I want, but on separates SELECT commands which I seem to be unable to get together !
Here is my case : 3 tables, first linked to the second with a common id, second linked to the third with another common id, but no common id from the first to the third. Let's say :
Table A : id, name
Table B : id, idA, amount
Table C : id, idB, amount
Several names in Table A. Several amounts in Table B. Several amounts in Table C. Result wanted : each A.id and A.name, with the corresponding SUM of B.amount, and with the corresponding SUM of C.amount. Let's say :
A.id
A.name
SUM(B.amount) WHERE B.idA = A.id
SUM(C.amount) WHERE C.idB = B.id for each B which B.idA = A.id
It's okay for "the first three columns", and "the first two columns and the fourth", both with a WHERE clause and/or a LEFT JOIN. But I can't achieve cumulating all fourth columns together without messing everything !
One could say "it's easy, just put an idA column in Table C" ! Should be easier, sure. But is it really necessary ? I don't think so, but I could be wrong ! So, I just please anyone (who I will give an eternal "SQL God" decoration) with SQL skills to answer laughing "That's so simple ! Just do that and you are gone ! Stupid little newbies..." ;)
Running VB 2010 and MS SQL Server
Thanks for reading !
Try this:
SELECT A.Id, A.Name, ISNULL(SUM(B.amount), 0) as bSum, ISNULL(SUM(C2.Amount), 0) as cSum
FROM A
LEFT OUTER JOIN B ON A.Id = B.idA
LEFT OUTER JOIN (SELECT C.idB, SUM(C.AMOUNT) AS Amount FROM C GROUP BY C.idB) AS C2 ON C2.idB = B.Id
GROUP BY A.Id, A.Name
Try this:
SELECT
a.id,
a.name,
sum(x.amount) as amountb,
sum(x.amountc) as amountc
from a
left join (
select
b.id,
b.ida,
b.amount,
SUM(c.amount) as amountc
from b
left join c
on b.id = c.idb
group by
b.id,
b.amount,
b.ida
) x
on a.id = x.ida
group by
a.id,
a.name
This should give you the result set you're looking for. It sums all C.Amount's for each B.id, then adds it all together into a single result set. I tested it with a bit of sample data in MSSQL, and it works as expected.
Select a.id, a.name, sum(b.amount), sum(c.amount)
from a inner join b on a.id = b.idA
inner join c on b.id = c.idB
group by a.id, a.name
You need to add them separately:
select a.id, a.name, (coalesce(b.amount, 0.0) + coalesce(c.amount, 0.0))
from a left outer join
(select b.ida, sum(amount) as amount
from b
group by b.ida
) b
on a.id = b.ida left outer join
(select b.ida, sum(amount) as amount
from c join
b
on c.idb = b.id
group by b.ida
) c
on a.id = c.ida
The outer joins are to take into account when b and c records don't both exist for a given id.