SQL multi level joins - sql

I need to join at least 4 tables. Table A is an Association table that contains a guid for Table B and C, Parentguid (B), Childguid (C). Table D contains information just for table C.
I need the results to look like this.
B - C - D
Monitor - Computer Name - Active
So the main thing is to show all of B table, only C table that is connected to B, and only D table this is associated with C.
I suspect I will need sub joins ( ). I am still a novice, it makes sense in my head but I can't seem to make the code work. I have played with joins for the past 2 days.
FROM vHWDesktopMonitor mon -- [Symantec_CMDB2].[dbo].[ResourceAssociation]
join ResourceAssociation RM on mon._ResourceGuid = RM.ParentResourceGuid
full outer join vComputer comp on RM.ChildResourceGuid = comp.Guid
full outer join vAsset on RM.ChildResourceGuid = vAsset._ResourceGuid

FROM vHWDesktopMonitor A
FULL OUTER JOIN ResourceAssociation B
on A._ResourceGuid = B.ParentResourceGuid
LEFT JOIN vComputer C
on B.ChildResourceGuid = C.Guid
LEFT JOIN vAsset D
on C.ChildResourceGuid = D._ResourceGuid
So the above will return
All FROM A and ALL records from B (Full Outer between A, B)
Only records from C that are in B (LEFT Between B and C)
Only records from D that are in C (LEFT Between C and D)
However, if you apply any where clause limits, it could reduce records which otherwise would be kept due to the left or outer joins...
For example if A._ResourceGuid ='7' exists in A but isn't in B;
and you set where B._ResourceGuid ='7' then the A record would otherwise would be kept due to the full outer join would then be excluded (making the full outer join the same as an INNER JOIN)!
a Full outer would return data like this:
A B
7 7
2
3
if you add a where clause where B=7 then you may be expecting to get because of the full outer since you said return all records from both...
A B
7 7
2
But you would end up getting
A B
7 7
Because the where clause occurs AFTER the full outer and therefore reduces the A.2 record.
To compensate for this you either have to put teh limits on teh join before the full outer executes or handle it in a where clause (but this method is VERY messy and prone to error and performance issues)
So when using outer joins, you MUST put the limiting criteria on the JOIN itself like below..
FROM vHWDesktopMonitor A
FULL OUTER JOIN ResourceAssociation B
on A._ResourceGuid = B.ParentResourceGuid
and B._resourceGuid = '7'
LEFT JOIN vComputer C
on B.ChildResourceGuid = C.Guid
LEFT JOIN vAsset D
on C.ChildResourceGuid = D._ResourceGuid
You could also put it in the where clause but you must remember to account for all the outer joins on the table and include null values for the other (this is just messy and slow)
FROM vHWDesktopMonitor A
FULL OUTER JOIN ResourceAssociation B
on A._ResourceGuid = B.ParentResourceGuid
LEFT JOIN vComputer C
on B.ChildResourceGuid = C.Guid
LEFT JOIN vAsset D
on C.ChildResourceGuid = D._ResourceGuid
WHERE (A._ResourceGuid is null OR B.ParentResourceGuid ='7')

If I understand you correctly either of these should work"
FROM vHWDesktopMonitor mon -- [Symantec_CMDB2].[dbo].[ResourceAssociation]
left join ResourceAssociation RM on mon._ResourceGuid = RM.ParentResourceGuid
left join vComputer comp on RM.ChildResourceGuid = comp.Guid
left join vAsset on comp.Guid = vAsset._ResourceGuid
or
FROM vHWDesktopMonitor mon -- [Symantec_CMDB2].[dbo].[ResourceAssociation]
left join ResourceAssociation RM on mon._ResourceGuid = RM.ParentResourceGuid
left join (select [list fields here] from vComputer comp
join vAsset on comp.Guid = vAsset._ResourceGuid) comp2
on RM.ChildResourceGuid = comp2.Guid
this should get you all the records from vHWDesktopMonitor and teh asscoiated records from ResourceAssociation with nulls for any records in vHWDesktopMonitor but not in ResourceAssociation. Then you get al teh records in vComputer that are also in ResourceAssociation. Finally you get al teh records in vAsset that are in vComputer. as alawys when you are getting all the records in teh first table, tehere will be nulls inteh fileds from other tables if you do not have an associated record.
If this doesn't work, perhaps you need to show us some sample data and expected results.

Related

SQL include entries with no data

Here is my sql statement in Oracle that I am working with:
SELECT FACILITY.fac_name
, SUM(FEE_LOG.fee_amount) AS TOTAL_FEES
FROM FACILITY
, BOOK_DETAIL
, TRANS_LOG
, FEE_LOG
, TRANS_TYPE
WHERE
FACILITY.fac_id = BOOK_DETAIL.fac_id
AND BOOK_DETAIL.bkdt_id = TRANS_LOG.bkdt_id
AND TRANS_LOG.trans_id = FEE_LOG.trans_id
AND TRANS_LOG.trans_type_id = TRANS_TYPE.trans_type_id
AND
(
TRANS_TYPE.trans_type_desc = 'LOST'
OR TRANS_TYPE.trans_type_desc = 'CHECKIN'
)
GROUP BY FACILITY.fac_name;
It outputs something similar to this:
FACILITY TOTAL_FEES
Facility 1 8.45
Facility 2 4.23
Facility 3 5.23
I have 2 other facilities but they do not have any fees associated with them. I want to show them as 0
So the output would be like:
FACILITY TOTAL_FEES
Facility 1 8.45
Facility 2 4.23
Facility 3 5.23
Facility 4 0
Facility 5 0
ER Diagram
Use LEFT JOIN instead of implicit INNER JOIN for the FEE_LOG table
SELECT FACILITY.fac_name
, SUM(FEE_LOG.fee_amount) AS TOTAL_FEES
FROM FACILITY
JOIN BOOK_DETAIL ON FACILITY.fac_id = BOOK_DETAIL.fac_id
JOIN TRANS_LOG ON BOOK_DETAIL.bkdt_id = TRANS_LOG.bkdt_id
LEFT JOIN FEE_LOG ON TRANS_LOG.trans_id = FEE_LOG.trans_id
JOIN TRANS_TYPE TRANS_LOG.trans_type_id = TRANS_TYPE.trans_type_id
WHERE TRANS_TYPE.trans_type_desc IN ('LOST', 'CHECKIN')
GROUP BY FACILITY.fac_name;
Use outer joins and the on clause:
SELECT FACILITY.fac_name, SUM(nvl(FEE_LOG.fee_amount,0)) AS TOTAL_FEES
FROM FACILITY
left join BOOK_DETAIL
on FACILITY.fac_id = BOOK_DETAIL.fac_id
left join TRANS_LOG
on BOOK_DETAIL.bkdt_id = TRANS_LOG.bkdt_id
left join FEE_LOG
on TRANS_LOG.trans_id = FEE_LOG.trans_id
left join TRANS_TYPE
on TRANS_LOG.trans_type_id = TRANS_TYPE.trans_type_id
and TRANS_TYPE.trans_type_desc in ('LOST', 'CHECKIN')
GROUP BY FACILITY.fac_name;
Note: If these facilities you are referring to, that do not have any fees, have rows on BOOK_DETAIL or TRANS_LOG you can replace the outer join with an inner join (for just those tables). Any table at which point there may or may not be a related record you have to use an outer join.
Try this:
SELECT f.fac_name
, coalesce(SUM(fl.fee_amount),0) AS TOTAL_FEES
FROM FACILITY AS f
INNER JOIN BOOK_DETAIL AS bd
ON bd.fac_id = f.fac_id
INNER JOIN TRANS_LOG AS tl
ON tl.bkdt_id = bd.bkdt_id
LEFT OUTER JOIN FEE_LOG AS fl
ON fl.trans_id = tl.trans_id
INNER JOIN TRANS_TYPE AS tt
ON tt.trans_type_id = tl.trans_type_id
WHERE tt.trans_type_desc in ('LOST' , 'CHECKIN')
GROUP BY f.fac_name;
NB: when you list tables after from with commas between them you're effectively using a cross join.
When you add a condition to your where statement matching a field from one table to a field from another, the join becomes an inner join.
To select all records from one table along with any matches which may or may not exist from another you need an outer join.
There are 3 types:
left outer join where you want all records from the first table listed and any matching records from the second.
right outer join for all records from the second with only matches from the first.
full outer join is all records from both tables, alongside one another where there's a match.
You should always specify the type of join rather than implying it through where clauses / commas, as this makes your intentions clearer and thus your code more readable.

SQL: Proper JOIN Protocol

I have the following tables with the following attributes:
Op(OpNo, OpName, Date)
OpConvert(OpNo, M_OpNo, Source_ID, Date)
Source(Source_ID, Source_Name, Date)
Fleet(OpNo, S_No, Date)
I have the current multiple JOIN query which gives me the results that I want:
SELECT O.OpNo AS Op_NO, O.OpName, O.Date AS Date_Entered, C.*
FROM Op O
LEFT OUTER JOIN OpConvert C
ON O.OpNo = C.OpNo
LEFT OUTER JOIN Source D
ON C.Source_ID = D.Source_ID
WHERE C.OpNo IS NOT NULL
The problem is this. I need to join the Fleet table on the previous multiple JOIN statement to attach the relevant S_No to the multiple JOIN table. Would I still be able to accomplish this using a LEFT OUTER JOIN or would I have to use a different JOIN statement? Also, which table would I JOIN on?
Please note that I am only familiar with LEFT OUTER JOINS.
Thanks.
I guess in your case you could use INNER JOIN or LEFT JOIN (which is the same thing as LEFT OUTER JOIN in SQL Server.
INNER JOIN means that it will only return records from other tables only if there are corresponding records (based on the join condition) in the Fleet table.
LEFT JOIN means that it will return records from other tables even if there are no corresponding records (based on the join condition) in the Fleet table. All columns from Fleet will return NULL in this case.
As for which table to join, you should really join the table that makes more logical sense based on your data structure.
Yes, you can use all tables mentioned before in your join conditions. Actually, JOINS (no matter of INNER, LEFT OUTER, RIGHT OUTER, CROSS, FULL OUTER or whatever) are left- associative, i. e. they are implicitly evaluated as if they would have been included in parentheses from the left as follows:
FROM ( ( ( Op O
LEFT OUTER JOIN OpConvert C
ON O.OpNo = C.OpNo
)
LEFT OUTER JOIN Source D
ON C.Source_ID = D.Source_ID
)
LEFT OUTER JOIN Fleet
ON ...
)
This is similar to how + or - would implicitly use parentheses, i. e.
2 + 3 - 4 - 5
is evaluated as
(((2 + 3) - 4) - 5)
By the way: If you use C.OpNo IS NOT NULL, then the LEFT OUTER JOIN Source D is treated as if it were an INNER JOIN, as you are explicitly removing all the "OUTER" rows.

When I add a LEFT OUTER JOIN, the query returns only a few rows

The original query returns 160k rows. When I add the LEFT OUTER JOIN:
LEFT OUTER JOIN Table_Z Z WITH (NOLOCK) ON A.Id = Z.Id
the query returns only 150 rows. I'm not sure what I'm doing wrong.
All I need to do is add a column to the query, which will bring back a code from a different table. The code could be a number or a NULL. I still have to display NULL, hence the reason for the LEFT join. They should join on the "id" columns.
SELECT <lots of stuff> + the new column that I need (called "code").
FROM
dbo.Table_A A WITH (NOLOCK)
INNER JOIN
dbo.Table_B B WITH (NOLOCK) ON A.Id = B.Id AND A.version = B.version
--this is where I added the LEFT OUTER JOIN. with it, the query returns 150 rows, without it, 160k rows.
LEFT OUTER JOIN
Table_Z Z WITH (NOLOCK) ON A.Id = Z.Id
LEFT OUTER JOIN
Table_E E WITH (NOLOCK) ON A.agent = E.agent
LEFT OUTER JOIN
Table_D D WITH (NOLOCK) ON E.location = D.location
AND E.type = 'Organization'
AND D.af_type = 'agent_location'
INNER JOIN
(SELECT X , MAX(Version) AS MaxVersion
FROM LocalTable WITH (NOLOCK)
GROUP BY agemt) P ON E.agent = P.location AND E.Version = P.MaxVersion
Does anyone have any idea what could be causing the issue?
When you perform a LEFT OUTER JOIN between tables A and E, you are maintaining your original set of data from A. That is to say, there is no data, or lack of data, in table E that can reduce the number of rows in your query.
However, when you then perform an INNER JOIN between E and P at the bottom, you are indeed opening yourself up to the possibility of reducing the number of rows returned. This will treat your subsequent LEFT OUTER JOINs like INNER JOINs.
Now, without your exact schema and a set of data to test against, this may or may not be the exact issue you are experiencing. Still, as a general rule, always put your INNER JOINs before your OUTER JOINs. It can make writing queries like this much, much easier. Your most restrictive joins come first, and then you won't have to worry about breaking any of your outer joins later on.
As a quick fix, try changing your last join to P to a LEFT OUTER JOIN, just to see if the Z join works.
You have to be very careful once you start with LEFT JOINs.
Let's suppose this model: You have tables Products, Orders and Customers. Not all products necessarily have been ordered, but every order must have customer entered.
Task: Show all products, and if the product was ordered, list the ordering customers; i.e., product without orders will be shown as one row, product with 10 orders will have 10 rows in the resultset. This calls for a query designed around FROM Products LEFT JOIN Orders.
Now someone could think "OK, Customer is always entered into orders, so I can make inner join from orders to customers". Wrong. Since the table Customers is joined through left-joined table Orders, it has to be left-joined itself... otherwise the inner join will propagate into the previous level(s) and as a result, you will lose all products that have no orders.
That is, once you join any table using LEFT JOIN, any subsequent tables that are joined through this table, need to keep LEFT JOINs. But it does not mean that once you use LEFT JOIN, all joins have to be of that type... only those that are dependent on the first performed LEFT JOIN. It would be perfectly fine to INNER JOIN the table Products with another table Category for example, if you only want to see Products which have a category set.
(Answer is based on this answer: http://www.sqlservercentral.com/Forums/Topic247971-8-1.aspx -> last entry)

Sybase 12 LEFT JOIN Performance issue for CrystalReports

I am trying to optimize a Crystal Report that is used very frequently here. I succeeded to optimize lots of queries but I still have one last bottleneck: This is the main query, generated from the report.
SELECT
A.*,
B.*,
C.*,
D.*,
E."N",
F."N",
G."N"
FROM
A
LEFT OUTER JOIN B ON
A."PK" = B."FK"
LEFT OUTER JOIN C ON
A."PK" = C."FK"
LEFT OUTER JOIN D ON
A."FK" = D."PK"
LEFT OUTER JOIN E ON
A."PK" = E."FK"
LEFT OUTER JOIN F ON
A."PK" = F."FK"
LEFT OUTER JOIN G ON
A."PK" = G."FK"
WHERE A.PK = ####
A,B,C and D are tables. E,F,G are simple views.
As you see, the report generated multiple LEFT JOINS. This query takes 2.28 seconds to complete (From the Plan Viewer stats). I identified three joins that seem problematic. If I remove E,F,G from the query, it becomes almost instant (0.0009s from the same stats)
SELECT
A.*,
B.*,
C.*,
D.*
FROM
A
LEFT OUTER JOIN B ON
A."PK" = B."FK"
LEFT OUTER JOIN C ON
A."PK" = C."FK"
LEFT OUTER JOIN D ON
A."FK" = D."PK"
WHERE A.PK = ####
I tought it might be the views that are slow, but if I do for example ...
SELECT *
FROM E
WHERE E.FK = ####
... it is also almost instant (0.0009s)
Tables all have indexes on PKs-FKs.
Views E,F,G all return one or no row with [FK|N] as columns, so the resulting column is NULL or a number.
Do you know how I could make this query fast?
PS: If I replace LEFT OUTER JOINS by INNER JOINS the main query becomes fast... :-/
Or trying to split this query into multiple queries on the report would be a better solution?
Thank you!
I would create functions for the lookup against E, F and G instead of joining them.
That way there is little chance the optimiser gets confused and tries to do stupid things.
SELECT
A.*,
B.*,
C.*,
D.*,
GET_E(A."PK"),
GET_F(A."PK"),
GET_G(A."PK")
FROM
A
LEFT OUTER JOIN B ON
A."PK" = B."FK"
LEFT OUTER JOIN C ON
A."PK" = C."FK"
LEFT OUTER JOIN D ON
A."FK" = D."PK"
WHERE A.PK = ####
The problem is probably because you are creating a huge cartesian product of 5 tables all joined to A in some way (A and D will only contribute one record to the product). Having such a big cartesian product will consume quite a bit of memory internally in Sybase. It is likely that your query is just wrong.

SQL joining three tables, join precedence

I have three tables: R, S and P.
Table R Joins with S through a foreign key; there should be at least one record in S, so I can JOIN:
SELECT
*
FROM
R
JOIN S ON (S.id = R.fks)
If there's no record in S then I get no rows, that's fine.
Then table S joins with P, where records is P may or may not be present and joined with S.
So I do
SELECT
*
FROM
R
JOIN S ON (S.id = R.fks)
LEFT JOIN P ON (P.id = S.fkp)
What if I wanted the second JOIN to be tied to S not to R, like if I could use parentheses:
SELECT
*
FROM
R
JOIN (S ON (S.id = R.fks) JOIN P ON (P.id = S.fkp))
Or is that already a natural behaviour of the cartesian product between R, S and P?
All kinds of outer and normal joins are in the same precedence class and operators take effect left-to-right at a given nesting level of the query. You can put the join expression on the right side in parentheses to cause it to take effect first. Remember that you will have to move the ON clauses around so that they stay with their joins—the join in parentheses takes its ON clause with it into the parentheses, so it now comes textually before the other ON clause which will be after the parentheses in the outer join statement.
(PostgreSQL example)
In
SELECT * FROM a LEFT JOIN b ON (a.id = b.id) JOIN c ON (b.ref = c.id);
the a-b join takes effect first, but we can force the b-c join to take effect first by putting it in parentheses, which looks like:
SELECT * FROM a LEFT JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
Often you can express the same thing without extra parentheses by moving the joins around and changing the direction of the outer joins, e.g.
SELECT * FROM b JOIN c ON (b.ref = c.id) RIGHT JOIN a ON (a.id = b.id);
When you join the third table, your first query
SELECT
*
FROM
R
JOIN S ON (S.id = R.fks)
is like a derived table to which you're joining the third table. So if R JOIN S produces no rows, then joining P will never yield any rows (because you're trying to join to an empty table).
So, if you're looking for precedence rules then in this case it's just set by using LEFT JOIN as opposed to JOIN.
However, I may be misunderstanding your question, because if I were writing the query, I would swap S and R around. eg.
SELECT
*
FROM
S
JOIN R ON (S.id = R.fks)
The second join is tied to S as you explicity state JOIN P ON (P.id = S.fkp) - no column from R is referenced in the join.
with a as (select 1 as test union select 2)
select * from a left join
a as b on a.test=b.test and b.test=1 inner join
a as c on b.test=c.test
go
with a as (select 1 as test union select 2)
select * from a inner join
a as b on a.test=b.test right join
a as c on b.test=c.test and b.test=1
Ideally, we would hope that the above two queries are the same. However, they are not - so anybody that says a right join can be replaced with a left join in all cases is wrong. Only by using the right join can we get the required result.