Entity Framework with Multiple Joins and Subquery - sql

I have a complex query with multiple left joins and subqueries which I need to implement in Entity Framework. I've received the monster SQL and
my goal is to do it in an elegant way with EF. The query consumes multiple tables and creates a "WITH" subquery on top which
is included in the joins later. I've made a first attempt with EF, but when I inspect the output that EF sends to the DB, INNER JOINs are sent where
I am expecting LEFT JOINs.
A summary of the SQL follows:
WITH SUB_QUERY
AS ( SELECT FIELD_A,
FIELD_B,
FIELD_C,
MAX (FIELD_D) MAX_FIELD_D
FROM TABLE_X
WHERE SOME FIELD_A = 'WHATEVER'
GROUP BY FIELD_A, FIELD_B, FIELD_C)
SELECT C.FIELD_A,
C.FIELD_B,
B.FIELD_X,
D.FIELD_S,
E.FIELD_J,
F.FIELD_Y
FROM TABLE_A A
LEFT JOIN SUB_QUERY B
ON A.FIELD_C = B.FIELD_C
LEFT JOIN TABLE_C C
ON B.FIELD_A = C.FIELD_A
LEFT JOIN TABLE_D D
ON A.FIELD_C = D.FIELD_C
LEFT JOIN TABLE_E E
ON A.FIELD_X = E.FIELD_X
LEFT JOIN TABLE_F F
ON A.FIELD_W = F.FIELD_W
WHERE A.FIELD_H = D.FIELD_H
AND A.FIELD_D = B.MAX_FIELD_D
As you see, a subquery on top filters and groups some data to be consumed in a join below. Then all the joins take place
and some fields are taken from different tables as the output of the query.
Which approach would you recommend to accomplish this task? I've tried different approaches and none of them works (they either retrieve nothing, or return many more rows than the SQL query does on the DB, etc.).
Please note that the Domain Model in Entity Framework is properly set up (primary keys, collections, nested objects, etc.), so I believe some of these
joins are not even required because my EF entities already contain references to the child collections and parent objects (navigation properties).
Thanks a lot!!

If you really need a left join, you should move each WHERE condition that refers to a left-joined table into the ON clause of that table's join:
FROM TABLE_A A
LEFT JOIN SUB_QUERY B
ON A.FIELD_C = B.FIELD_C AND A.FIELD_D = B.MAX_FIELD_D
LEFT JOIN TABLE_C C
ON B.FIELD_A = C.FIELD_A
LEFT JOIN TABLE_D D
ON A.FIELD_C = D.FIELD_C AND A.FIELD_H = D.FIELD_H
LEFT JOIN TABLE_E E
ON A.FIELD_X = E.FIELD_X
LEFT JOIN TABLE_F F
ON A.FIELD_W = F.FIELD_W
Using a column of a left-joined table in the WHERE clause forces the join to behave as an INNER JOIN, because rows without a match carry NULL in that column and fail the comparison.
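The difference between filtering in WHERE and filtering in ON can be checked on toy data. The following is a minimal sketch, assuming SQLite driven from Python rather than the asker's database; the tables `table_a` and `table_b` and their rows are made up for illustration and only mimic the column names of the question.

```python
import sqlite3

# In-memory SQLite database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER, field_c TEXT, field_d INTEGER);
    CREATE TABLE table_b (field_c TEXT, max_field_d INTEGER);
    INSERT INTO table_a VALUES (1, 'x', 10), (2, 'y', 20), (3, 'z', 30);
    -- Row 1 matches fully, row 2 matches on field_c only, row 3 has no match.
    INSERT INTO table_b VALUES ('x', 10), ('y', 99);
""")

# Condition on the left-joined table placed in WHERE:
# unmatched rows have NULL in b.max_field_d and are filtered out.
where_rows = conn.execute("""
    SELECT a.id, b.max_field_d
    FROM table_a a
    LEFT JOIN table_b b ON a.field_c = b.field_c
    WHERE a.field_d = b.max_field_d
    ORDER BY a.id
""").fetchall()

# Same condition placed in ON: every row of table_a survives,
# with NULL where no row of table_b satisfies both conditions.
on_rows = conn.execute("""
    SELECT a.id, b.max_field_d
    FROM table_a a
    LEFT JOIN table_b b ON a.field_c = b.field_c
                       AND a.field_d = b.max_field_d
    ORDER BY a.id
""").fetchall()

print(where_rows)  # [(1, 10)]
print(on_rows)     # [(1, 10), (2, None), (3, None)]
```

Whether the ON form or the WHERE form is the right one depends on whether rows of TABLE_A without a match are supposed to appear in the result at all.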

Related

Is it possible to have multiple joins between two tables in stored procedure?

I have two tables, "Booking" and "City". CityName field is primary key in City table and I have used it as foreign key for two columns "SourceCity" and "DestinationCity" in Booking table. I want to create a stored procedure to select all existing data from the Booking table for creating a view list, for which I have written the following.
SELECT [dbo].[Booking].[BookingID],
[dbo].[Booking].[CustomerName],
[dbo].[City].[CityName],
[dbo].[City].[CityName],
[dbo].[Booking].[StartingDate],
[dbo].[Booking].[EndingDate],
[dbo].[Car].[LicensePlateNumber],
[dbo].[Driver].[DriverName],
[dbo].[Booking].[AdvanceTaken],
[dbo].[Booking].[PendingPayment],
[dbo].[Booking].[TotalRent],
[dbo].[Booking].[BookingDate],
[dbo].[Booking].[IDProof]
FROM [dbo].[Booking]
LEFT OUTER JOIN [dbo].[City]
ON [dbo].[Booking].[SourceCity] = [dbo].[City].[CityName]
AND [dbo].[Booking].[DestinationCity] = [dbo].[City].[CityName]
LEFT OUTER JOIN [dbo].[Driver]
ON [dbo].[Driver].[DriverID] = [dbo].[Booking].[DriverAllotted]
LEFT OUTER JOIN [dbo].[Car]
ON [dbo].[Car].[CarID] = [dbo].[Booking].[CarAllotted]
ORDER BY [dbo].[Booking].[BookingID]
I am not sure if it is possible to do the following
LEFT OUTER JOIN [dbo].[City]
ON [dbo].[Booking].[SourceCity] = [dbo].[City].[CityName]
AND [dbo].[Booking].[DestinationCity] = [dbo].[City].[CityName]
I guess you need a different JOIN
FROM [dbo].[Booking] as booking
LEFT OUTER JOIN [dbo].[City] as source_city
ON booking.[SourceCity] = source_city.[CityName]
LEFT OUTER JOIN [dbo].[City] as destination_city
ON booking.[DestinationCity] = destination_city.[CityName]
....
Yes it is possible, you just need to use a different table alias. Beyond referencing the same table twice, table aliases can make your code look a lot cleaner, e.g.
SELECT b.CustomerName,
sc.CityName AS SourceCity,
dc.CityName AS DestinationCity,
b.StartingDate,
b.EndingDate,
c.LicensePlateNumber,
d.DriverName,
b.AdvanceTaken,
b.PendingPayment,
b.TotalRent,
b.BookingDate,
b.IDProof
FROM dbo.Booking AS b
LEFT OUTER JOIN dbo.City AS sc
ON sc.CityName = b.SourceCity
LEFT OUTER JOIN dbo.City AS dc -- Different Alias here
ON dc.CityName = b.DestinationCity
LEFT OUTER JOIN dbo.Driver AS d
ON d.DriverID = b.DriverAllotted
LEFT OUTER JOIN dbo.Car AS c
ON c.CarID = b.CarAllotted
ORDER BY
b.BookingID;
I appreciate that "cleaner" is somewhat subjective, but I would be astonished if anyone found this harder to read than your original query.
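To see why the single join with two conditions does not do what was intended, here is a small sketch on made-up data, assuming SQLite through Python's sqlite3 module instead of SQL Server. The schema is cut down to the columns that matter and the city names are hypothetical.

```python
import sqlite3

# In-memory SQLite database with a reduced, hypothetical version of the schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE City (CityName TEXT PRIMARY KEY);
    CREATE TABLE Booking (
        BookingID INTEGER PRIMARY KEY,
        SourceCity TEXT,
        DestinationCity TEXT
    );
    INSERT INTO City VALUES ('Pune'), ('Mumbai');
    INSERT INTO Booking VALUES (1, 'Pune', 'Mumbai'), (2, 'Pune', 'Pune');
""")

# One join with both conditions: a City row must equal the source AND the
# destination at once, so it only matches when both cities are the same.
single_join = conn.execute("""
    SELECT b.BookingID, c.CityName
    FROM Booking b
    LEFT OUTER JOIN City c
        ON b.SourceCity = c.CityName
       AND b.DestinationCity = c.CityName
    ORDER BY b.BookingID
""").fetchall()

# Two joins to the same table under different aliases: one lookup per column.
two_aliases = conn.execute("""
    SELECT b.BookingID, sc.CityName, dc.CityName
    FROM Booking b
    LEFT OUTER JOIN City sc ON sc.CityName = b.SourceCity
    LEFT OUTER JOIN City dc ON dc.CityName = b.DestinationCity
    ORDER BY b.BookingID
""").fetchall()

print(single_join)  # [(1, None), (2, 'Pune')]
print(two_aliases)  # [(1, 'Pune', 'Mumbai'), (2, 'Pune', 'Pune')]
```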

Inner Join and Left Join on 5 tables in Access using SQL

I am attempting to access data from the following tables:
OrgPlanYear
ProjOrgPlnYrJunction
DC
DCMaxEEContribLevel
DCNonDiscretionaryContribLevel
Basically, I need to inner join OrgPlanYear, DC and ProjOrgPlnYrJunction, and then left join the remaining tables (tables 4 and 5), because tables 1-3 have all the rows I need and only some of those rows have data in tables 4-5. I need several columns from each table. I also need the WHERE condition to apply across all of them (meaning I want all this data for a select group where projectID = 919).
Please help!
I have tried many things, including the query designer, and keep getting errors (JOIN expression issues, a badly formatted FROM clause, etc.). Here is one example, which leaves out most of the columns I need:
SELECT
ProjOrgPlnYrJunction.fkeyProjectID, OrgPlanYear.OrgName, DC.PlanCode, DCNonDiscretionaryContribLevel.Age,DCNonDiscretionaryContribLevel.Service
FROM
(((OrgPlanYear INNER JOIN DC ON OrgPlanYear.OrgPlanYearID = DC.fkeyOrgPlanYearID) INNER JOIN ProjOrgPlnYrJunction ON OrgPlanYear.OrgPlanYearID = ProjOrgPlnYrJunction.fkeyOrgPlanYearID)
LEFT JOIN
(SELECT DCNonDiscretionaryContribLevel.Age AS Age, DCNonDiscretionaryContribLevel.Service AS Service FROM DCNonDiscretionaryContribLevel WHERE ProjOrgPlnYrJunction.fkeyProjectID)=919)
LEFT JOIN (
SELECT DCMaxEEContribLevel.EEContribRoth FROM EEContribRoth WHERE ProjOrgPlnYrJunction.fkeyProjectID)=919)
ORDER BY OrgPlanYear.OrgName;
Main issues with your query:
Missing ON clauses for each LEFT JOIN.
Referencing other table columns in SELECT and WHERE of a different subquery (e.g., FROM DCNonDiscretionaryContribLevel WHERE ProjOrgPlnYrJunction.fkeyProjectID).
Unmatched parentheses around subqueries and joins per Access SQL requirements.
See below the adjusted SQL, which now uses short table aliases. It assumes that both lookup tables have an fkeyProjectID column; be sure to adjust the SELECT and ON clauses to the columns your tables actually have.
SELECT p.fkeyProjectID, o.OrgName, DC.PlanCode, dcn.Age, dcn.Service, e.EEContribRoth
FROM (((OrgPlanYear o
INNER JOIN DC
ON o.OrgPlanYearID = DC.fkeyOrgPlanYearID)
INNER JOIN ProjOrgPlnYrJunction p
ON o.OrgPlanYearID = p.fkeyOrgPlanYearID)
LEFT JOIN
(SELECT fkeyProjectID, Age, Service
FROM DCNonDiscretionaryContribLevel
WHERE fkeyProjectID = 919) AS dcn
ON dcn.fkeyProjectID = p.fkeyProjectID)
LEFT JOIN
(SELECT fkeyProjectID, EEContribRoth
FROM DCMaxEEContribLevel
WHERE fkeyProjectID = 919) AS e
ON e.fkeyProjectID = p.fkeyProjectID
ORDER BY o.OrgName;

SQL: Proper JOIN Protocol

I have the following tables with the following attributes:
Op(OpNo, OpName, Date)
OpConvert(OpNo, M_OpNo, Source_ID, Date)
Source(Source_ID, Source_Name, Date)
Fleet(OpNo, S_No, Date)
I have the current multiple JOIN query which gives me the results that I want:
SELECT O.OpNo AS Op_NO, O.OpName, O.Date AS Date_Entered, C.*
FROM Op O
LEFT OUTER JOIN OpConvert C
ON O.OpNo = C.OpNo
LEFT OUTER JOIN Source D
ON C.Source_ID = D.Source_ID
WHERE C.OpNo IS NOT NULL
The problem is this. I need to join the Fleet table on the previous multiple JOIN statement to attach the relevant S_No to the multiple JOIN table. Would I still be able to accomplish this using a LEFT OUTER JOIN or would I have to use a different JOIN statement? Also, which table would I JOIN on?
Please note that I am only familiar with LEFT OUTER JOINS.
Thanks.
I guess in your case you could use INNER JOIN or LEFT JOIN (which is the same thing as LEFT OUTER JOIN in SQL Server).
INNER JOIN means that it will return records from the other tables only if there are corresponding records (based on the join condition) in the Fleet table.
LEFT JOIN means that it will return records from other tables even if there are no corresponding records (based on the join condition) in the Fleet table. All columns from Fleet will return NULL in this case.
As for which table to join, you should really join the table that makes more logical sense based on your data structure.
Yes, you can use all the tables mentioned before in your join conditions. Actually, JOINs (whether INNER, LEFT OUTER, RIGHT OUTER, CROSS, FULL OUTER or whatever) are left-associative, i.e. they are implicitly evaluated as if they had been enclosed in parentheses from the left as follows:
FROM ( ( ( Op O
LEFT OUTER JOIN OpConvert C
ON O.OpNo = C.OpNo
)
LEFT OUTER JOIN Source D
ON C.Source_ID = D.Source_ID
)
LEFT OUTER JOIN Fleet
ON ...
)
This is similar to how + or - would implicitly use parentheses, i.e.
2 + 3 - 4 - 5
is evaluated as
(((2 + 3) - 4) - 5)
By the way: If you use C.OpNo IS NOT NULL, then the LEFT OUTER JOIN OpConvert C is treated as if it were an INNER JOIN, as you are explicitly removing all the "OUTER" rows it produced. The LEFT OUTER JOIN Source D is not affected by that filter.
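Which join the IS NOT NULL filter actually affects can be checked with a short sketch. It assumes SQLite through Python's sqlite3 module rather than SQL Server, keeps only a few columns of the tables from the question, and uses invented rows.

```python
import sqlite3

# In-memory SQLite database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Op (OpNo INTEGER, OpName TEXT);
    CREATE TABLE OpConvert (OpNo INTEGER, Source_ID INTEGER);
    CREATE TABLE Source (Source_ID INTEGER, Source_Name TEXT);
    INSERT INTO Op VALUES (1, 'alpha'), (2, 'beta'), (3, 'gamma');
    -- Op 3 has no OpConvert row; Op 2 points at a Source that does not exist.
    INSERT INTO OpConvert VALUES (1, 100), (2, 200);
    INSERT INTO Source VALUES (100, 'feed');
""")

# Query shape from the question: two outer joins plus the IS NOT NULL filter.
filtered_outer = conn.execute("""
    SELECT O.OpNo, C.Source_ID, D.Source_Name
    FROM Op O
    LEFT OUTER JOIN OpConvert C ON O.OpNo = C.OpNo
    LEFT OUTER JOIN Source D ON C.Source_ID = D.Source_ID
    WHERE C.OpNo IS NOT NULL
    ORDER BY O.OpNo
""").fetchall()

# Same query with the first join written as INNER JOIN and no filter.
inner_then_outer = conn.execute("""
    SELECT O.OpNo, C.Source_ID, D.Source_Name
    FROM Op O
    INNER JOIN OpConvert C ON O.OpNo = C.OpNo
    LEFT OUTER JOIN Source D ON C.Source_ID = D.Source_ID
    ORDER BY O.OpNo
""").fetchall()

print(filtered_outer)    # [(1, 100, 'feed'), (2, 200, None)]
print(inner_then_outer)  # [(1, 100, 'feed'), (2, 200, None)]
```

Op 3 disappears in both versions, while Op 2 keeps its row with a NULL Source_Name, so on this data the join to Source still behaves as an outer join.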

Sybase 12 LEFT JOIN Performance issue for CrystalReports

I am trying to optimize a Crystal Report that is used very frequently here. I managed to optimize lots of queries, but I still have one last bottleneck: the main query, generated from the report.
SELECT
A.*,
B.*,
C.*,
D.*,
E."N",
F."N",
G."N"
FROM
A
LEFT OUTER JOIN B ON
A."PK" = B."FK"
LEFT OUTER JOIN C ON
A."PK" = C."FK"
LEFT OUTER JOIN D ON
A."FK" = D."PK"
LEFT OUTER JOIN E ON
A."PK" = E."FK"
LEFT OUTER JOIN F ON
A."PK" = F."FK"
LEFT OUTER JOIN G ON
A."PK" = G."FK"
WHERE A.PK = ####
A,B,C and D are tables. E,F,G are simple views.
As you see, the report generated multiple LEFT JOINS. This query takes 2.28 seconds to complete (From the Plan Viewer stats). I identified three joins that seem problematic. If I remove E,F,G from the query, it becomes almost instant (0.0009s from the same stats)
SELECT
A.*,
B.*,
C.*,
D.*
FROM
A
LEFT OUTER JOIN B ON
A."PK" = B."FK"
LEFT OUTER JOIN C ON
A."PK" = C."FK"
LEFT OUTER JOIN D ON
A."FK" = D."PK"
WHERE A.PK = ####
I thought it might be the views that are slow, but if I do for example ...
SELECT *
FROM E
WHERE E.FK = ####
... it is also almost instant (0.0009s)
Tables all have indexes on PKs-FKs.
Views E,F,G all return one or no row with [FK|N] as columns, so the resulting column is NULL or a number.
Do you know how I could make this query fast?
PS: If I replace LEFT OUTER JOINS by INNER JOINS the main query becomes fast... :-/
Or would splitting this query into multiple queries in the report be a better solution?
Thank you!
I would create functions for the lookup against E, F and G instead of joining them.
That way there is little chance the optimiser gets confused and tries to do stupid things.
SELECT
A.*,
B.*,
C.*,
D.*,
GET_E(A."PK"),
GET_F(A."PK"),
GET_G(A."PK")
FROM
A
LEFT OUTER JOIN B ON
A."PK" = B."FK"
LEFT OUTER JOIN C ON
A."PK" = C."FK"
LEFT OUTER JOIN D ON
A."FK" = D."PK"
WHERE A.PK = ####
The problem is probably because you are creating a huge cartesian product of 5 tables all joined to A in some way (A and D will only contribute one record to the product). Having such a big cartesian product will consume quite a bit of memory internally in Sybase. It is likely that your query is just wrong.
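The row multiplication that this answer describes can be illustrated with a small sketch. It assumes SQLite via Python's sqlite3 module instead of Sybase, and hypothetical child tables with several rows per key. The question states that the views E, F and G return at most one row each, so this effect would mainly come from B and C, if they hold several rows per key.

```python
import sqlite3

# In-memory SQLite database; A has one row, B and C have several children each.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (PK INTEGER);
    CREATE TABLE B (FK INTEGER, val TEXT);
    CREATE TABLE C (FK INTEGER, val TEXT);
    INSERT INTO A VALUES (1);
    INSERT INTO B VALUES (1, 'b1'), (1, 'b2'), (1, 'b3');
    INSERT INTO C VALUES (1, 'c1'), (1, 'c2'), (1, 'c3'), (1, 'c4');
""")

# B and C are both joined to A but not to each other, so every B row
# is paired with every C row for the same key: 3 * 4 combinations.
row_count = conn.execute("""
    SELECT COUNT(*)
    FROM A
    LEFT OUTER JOIN B ON A.PK = B.FK
    LEFT OUTER JOIN C ON A.PK = C.FK
    WHERE A.PK = 1
""").fetchone()[0]

print(row_count)  # 12
```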

Combining OUTER JOIN and WHERE

I'm trying to fetch some data from a database.
I want to select an employee, and if available, all appointments and other data related to that employee.
This is the query:
SELECT
TA.id,
TEI.displayname,
TA.threatment_id,
TTS.appointment_date,
TEI.displayname
FROM
tblemployee AS TE
LEFT OUTER Join tblappointment AS TA ON TE.employeeid = TA.employee_id
Inner Join tblthreatment AS T ON TA.threatment_id = T.threatmentid
Inner Join tblappointments AS TTS ON TTS.id = TA.appointments_id AND
TTS.appointment_date = '2009-09-28'
INNER Join tblemployeeinfo AS TEI ON TEI.employeeinfoid = TE.employeeinfoid
Inner Join tblcustomercard AS TCC ON TCC.customercardid = TTS.customercard_id
WHERE
TE.employeeid = 4
The problem is, it just returns null for all fields selected when there are no appointments. What am I not getting here?
Edit:
For clarity, I removed some of the columns. I removed one too many. TEI.displayname should at least be displayed.
Looking at the list of columns returned by your query, you will notice that they all come from the "right" side of the LEFT OUTER JOIN. You do not include any columns from the "left" side of the join. Therefore, the expected result is the one you are observing — NULL values supplied for all right-hand columns in the result set for those rows that have no right-hand rows returned.
To see data even for those rows, include some columns from TE (tblemployee) in the result set.
Looking at your query I'm guessing that the situation is a bit more complex and that some of those tables on the right-hand side of the join should be moved to the left-hand side and, furthermore, that some of the other tables might possibly require their own OUTER joins to participate correctly in the query.
Edited w/ response to questioner's comment:
You have an odd situation (maybe not odd at all, depending on your application) in which you have an employee table and a separate employee information (employeeinfo) table.
Because you are joining the employeeinfo to the appointments table with an INNER join you can effectively think of them as a single table in terms of how they contribute to the final result set. Because this combined table REQUIRES a record in the appointments table and because this combined table is joined into the main result set with a LEFT OUTER join, the effect is that the employeeinfo record is not found if there's no appointment to link it to.
If you move the employeeinfo table to the left side of the join, or replace the employee table w/ the employeeinfo table, you should get the results you want.
In your query, you LEFT OUTER JOIN to the tblappointment table, but then you INNER JOIN to the tblthreatment and tblappointments tables.
You should try to structure your query in the order in which you expect data to be there. In most simple queries, once you perform an OUTER join, most tables after that will also be OUTER joined. This is by NO MEANS a rule and complex queries can vary, but in the majority of simple queries it's a good practice.
Try something like this for your query.
SELECT
TA.id,
TEI.displayname,
TA.threatment_id,
TTS.appointment_date
FROM
tblemployee AS TE
INNER Join
tblemployeeinfo AS TEI
ON
TEI.employeeinfoid = TE.employeeinfoid
LEFT OUTER Join
tblappointment AS TA
ON
TE.employeeid = TA.employee_id
LEFT OUTER JOIN
tblthreatment AS T
ON
TA.threatment_id = T.threatmentid
LEFT OUTER JOIN
tblappointments AS TTS
ON
TTS.id = TA.appointments_id
AND
TTS.appointment_date = '2009-09-28'
LEFT OUTER JOIN
tblcustomercard AS TCC
ON
TCC.customercardid = TTS.customercard_id
WHERE
TE.employeeid = 4
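The effect of following a LEFT OUTER JOIN with an INNER JOIN can be reproduced on a cut-down schema. This is a minimal sketch, assuming SQLite through Python's sqlite3 module; only three of the tables are kept, the `name` column and all rows are invented, and employee 4 deliberately has no appointment.

```python
import sqlite3

# In-memory SQLite database with a reduced, hypothetical version of the schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblemployee (employeeid INTEGER, name TEXT);
    CREATE TABLE tblthreatment (threatmentid INTEGER);
    CREATE TABLE tblappointment (id INTEGER, employee_id INTEGER, threatment_id INTEGER);
    INSERT INTO tblemployee VALUES (4, 'Ann'), (5, 'Bob');
    INSERT INTO tblthreatment VALUES (1);
    -- Only employee 5 has an appointment.
    INSERT INTO tblappointment VALUES (10, 5, 1);
""")

# LEFT OUTER JOIN followed by INNER JOIN: the inner join compares the NULL
# threatment_id of the unmatched row and drops the employee again.
mixed = conn.execute("""
    SELECT TE.employeeid, TA.id
    FROM tblemployee AS TE
    LEFT OUTER JOIN tblappointment AS TA ON TE.employeeid = TA.employee_id
    INNER JOIN tblthreatment AS T ON TA.threatment_id = T.threatmentid
    WHERE TE.employeeid = 4
""").fetchall()

# Outer joins all the way down: the employee row is kept with NULLs.
all_outer = conn.execute("""
    SELECT TE.employeeid, TA.id
    FROM tblemployee AS TE
    LEFT OUTER JOIN tblappointment AS TA ON TE.employeeid = TA.employee_id
    LEFT OUTER JOIN tblthreatment AS T ON TA.threatment_id = T.threatmentid
    WHERE TE.employeeid = 4
""").fetchall()

print(mixed)      # []
print(all_outer)  # [(4, None)]
```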
The issue is the way you're joining: almost everything joins off your left outer-joined table, so whenever that table has no matching row, there is nothing for the other tables to join to. Try to rearrange your query so that everything joins off your employeeID. I normally add left-joined tables only after I've limited everything down as much as possible with inner joins.
So my query would be something like:
SELECT
TA.id,
TEI.displayname,
TA.threatment_id,
TTS.appointment_date
FROM
tblemployee AS TE
INNER Join tblemployeeinfo AS TEI ON TEI.employeeinfoid = TE.employeeinfoid
LEFT OUTER Join (
tblappointment AS TA
Inner Join tblthreatment AS T ON TA.threatment_id = T.threatmentid
Inner Join tblappointments AS TTS ON TTS.id = TA.appointments_id AND
TTS.appointment_date = '2009-09-28'
Inner Join tblcustomercard AS TCC ON TCC.customercardid = TTS.customercard_id
) ON TE.employeeid = TA.employee_id
WHERE
TE.employeeid = 4
Here the single outer join comes last and attaches the whole appointment chain to the employee row; the chain is grouped in parentheses so that its inner joins are resolved first, because TA cannot be referenced in a join condition before it has been introduced. For speed, you also want to limit your data as early as possible with your first few inner joins, and then do the outer joins last so that possible NULL values are joined onto the smallest dataset you can. I hope this helps; if it's confusing, I'm sorry... I haven't had my caffeine yet.
The query is performing as it should.
A left outer join will select all records from one table, join them with the records in another, and produce NULLs where no records in the second table match the join condition.
If you're looking for a separate behavior, you may want to think about two separate queries.