Linq to Entities equivalent to a filtered inline view in T-SQL

I have a query from a database I need to make. I understand how to write the query in T-SQL. The real query is much more complicated, but a simple illustration of the pattern is something like this:
SELECT * FROM [dbo].[A] AS a
LEFT JOIN dbo.[B] AS b ON a.ID = b.ParentID
LEFT JOIN dbo.[C] AS c ON b.ID = c.ParentID
LEFT JOIN
(
    SELECT * FROM dbo.[D]
    WHERE OtherID = @otherID
) AS d ON c.ID = d.ParentID
LEFT JOIN
(
    SELECT * FROM dbo.[E]
    WHERE OtherID = @otherID
) AS e ON e.ID = e.ParentID
WHERE a.ID = @Id
I need to write that SQL in C# LINQ (for Entity Framework Core) such that it generates the equivalent of the filtered inline views above. The goal is to return a result set that always contains the tree A->B->C, and contains D or E if and only if those nodes also match the secondary filter. Note that it is quite easy to do the filtering within the inline view, but very difficult to do it outside the inline view, because filtering outside tends to cause C nodes to disappear when there is no matching D child. That is not the intention.
Thanks
PS: To clarify, you might make a first attempt to write the above as:
var query = from a in context.A
            join bt in context.B on a.ID equals bt.ParentID into btent
            from b in btent.DefaultIfEmpty()
            join ct in context.C on b.ID equals ct.ParentID into ctent
            from c in ctent.DefaultIfEmpty()
            join dt in context.D on c.ID equals dt.ParentID into dtent
            from d in dtent.DefaultIfEmpty()
            where a.ID == myPrimaryID && d.OtherID == myOtherID
            select new { a, b, c, d };
The trouble is that a where clause on the 'd' entity returns only those rows where a D entity exists, so the entire stack will be empty if it doesn't. If you try to get cute and filter where the 'd' entity is null or matches the filter, and then inspect the SQL generated by EF in that situation, it is incorrect. The correct filtering has to happen within the 'join', as with the T-SQL above.
PPS: Yes, if you aren't filtering except for the parent object, you can dispense with this entirely and just write the includes and the where clause, but I think on reflection you'll realize that filtering by a term that applies to a great-grand-child but doesn't filter the grand-child is complex. If you can write out the query in either 'form', I'd appreciate it.

Apart from the lack of a natural left outer join syntax, select coming last, and select * requiring a projection into an anonymous or concrete type (which can contain whole entities), LINQ supports the same constructs as standard SQL, including inline subqueries.
So it's possible to write the LINQ query in exactly the same way as the sample SQL query:
from a in db.A
join b in db.B on a.ID equals b.ParentID
into a_b from b in a_b.DefaultIfEmpty()
join c in (from c in db.C where c.OtherID == myOtherID select c) on b.ID equals c.ParentID
into b_c from c in b_c.DefaultIfEmpty()
join d in (from d in db.D where d.OtherID == myOtherID2 select d) on c.ID equals d.ParentID
into c_d from d in c_d.DefaultIfEmpty()
select new { a, b, c, d }
which is translated by EF Core to:
SELECT [s].[ID], [s0].[ID], [s0].[ParentID], [t].[ID], [t].[OtherID], [t].[ParentID], [t0].[ID], [t0].[OtherID], [t0].[ParentID]
FROM [SO6_A] AS [s]
LEFT JOIN [SO6_B] AS [s0] ON [s].[ID] = [s0].[ParentID]
LEFT JOIN (
SELECT [s1].[ID], [s1].[OtherID], [s1].[ParentID]
FROM [SO6_C] AS [s1]
WHERE [s1].[OtherID] = @__myOtherID_0
) AS [t] ON [s0].[ID] = [t].[ParentID]
LEFT JOIN (
SELECT [s2].[ID], [s2].[OtherID], [s2].[ParentID]
FROM [SO6_D] AS [s2]
WHERE [s2].[OtherID] = @__myOtherID2_1
) AS [t0] ON [t].[ID] = [t0].[ParentID]
Another standard LINQ way is to push the predicates into join conditions (thus not filtering out the outer join result) by using composite join keys:
from a in db.A
join b in db.B on a.ID equals b.ParentID
into a_b from b in a_b.DefaultIfEmpty()
join c in db.C on new { K1 = b.ID, K2 = myOtherID } equals new { K1 = c.ParentID, K2 = c.OtherID }
into b_c from c in b_c.DefaultIfEmpty()
join d in db.D on new { K1 = c.ID, K2 = myOtherID2 } equals new { K1 = d.ParentID, K2 = d.OtherID }
into c_d from d in c_d.DefaultIfEmpty()
select new { a, b, c, d }
which is translated to:
SELECT [s].[ID], [s0].[ID], [s0].[ParentID], [s1].[ID], [s1].[OtherID], [s1].[ParentID], [s2].[ID], [s2].[OtherID], [s2].[ParentID]
FROM [SO6_A] AS [s]
LEFT JOIN [SO6_B] AS [s0] ON [s].[ID] = [s0].[ParentID]
LEFT JOIN [SO6_C] AS [s1] ON ([s0].[ID] = [s1].[ParentID]) AND (@__myOtherID_0 = [s1].[OtherID])
LEFT JOIN [SO6_D] AS [s2] ON ([s1].[ID] = [s2].[ParentID]) AND (@__myOtherID2_1 = [s2].[OtherID])
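To see why the placement of the predicate matters, here is a minimal sketch that runs the competing SQL shapes against an in-memory SQLite database. The two-table schema, the sample rows, and the use of SQLite instead of SQL Server are assumptions made purely for illustration; only the join patterns come from the discussion above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE C (ID INTEGER PRIMARY KEY);
    CREATE TABLE D (ID INTEGER PRIMARY KEY, ParentID INTEGER, OtherID INTEGER);
    INSERT INTO C (ID) VALUES (1), (2);
    -- C 1 has a D that matches OtherID = 7; C 2 only has a non-matching D.
    INSERT INTO D (ID, ParentID, OtherID) VALUES (10, 1, 7), (20, 2, 8);
""")

def rows(sql):
    """Run a query with the hypothetical filter value 7 bound to :other."""
    return conn.execute(sql, {"other": 7}).fetchall()

# Filter in WHERE: the outer join is effectively turned into an inner join.
where_filter = rows("""
    SELECT c.ID, d.ID FROM C AS c
    LEFT JOIN D AS d ON c.ID = d.ParentID
    WHERE d.OtherID = :other
    ORDER BY c.ID
""")

# "NULL or matches" in WHERE: still loses C 2, because a non-matching D exists.
null_or_match = rows("""
    SELECT c.ID, d.ID FROM C AS c
    LEFT JOIN D AS d ON c.ID = d.ParentID
    WHERE d.ID IS NULL OR d.OtherID = :other
    ORDER BY c.ID
""")

# Filtered inline view: every C survives, D is attached only when it matches.
inline_view = rows("""
    SELECT c.ID, d.ID FROM C AS c
    LEFT JOIN (SELECT * FROM D WHERE OtherID = :other) AS d ON c.ID = d.ParentID
    ORDER BY c.ID
""")

# Predicate pushed into the ON clause: same result as the inline view.
on_predicate = rows("""
    SELECT c.ID, d.ID FROM C AS c
    LEFT JOIN D AS d ON c.ID = d.ParentID AND d.OtherID = :other
    ORDER BY c.ID
""")

print(where_filter)   # [(1, 10)]
print(null_or_match)  # [(1, 10)]
print(inline_view)    # [(1, 10), (2, None)]
print(on_predicate)   # [(1, 10), (2, None)]
```

With this sample data, the inline view and the ON-clause predicate return the same rows, which is consistent with the two translations shown above being interchangeable for this purpose.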
A more compact LINQ way is to use correlated subqueries instead of joins:
from a in db.A
from b in db.B.Where(b => a.ID == b.ParentID).DefaultIfEmpty()
from c in db.C.Where(c => b.ID == c.ParentID && c.OtherID == myOtherID).DefaultIfEmpty()
from d in db.D.Where(d => c.ID == d.ParentID && d.OtherID == myOtherID2).DefaultIfEmpty()
select new { a, b, c, d }
which is happily translated by EF Core to:
SELECT [s].[ID], [s0].[ID], [s0].[ParentID], [t].[ID], [t].[OtherID], [t].[ParentID], [t0].[ID], [t0].[OtherID], [t0].[ParentID]
FROM [SO6_A] AS [s]
LEFT JOIN [SO6_B] AS [s0] ON [s].[ID] = [s0].[ParentID]
LEFT JOIN (
SELECT [s1].[ID], [s1].[OtherID], [s1].[ParentID]
FROM [SO6_C] AS [s1]
WHERE [s1].[OtherID] = @__myOtherID_0
) AS [t] ON [s0].[ID] = [t].[ParentID]
LEFT JOIN (
SELECT [s2].[ID], [s2].[OtherID], [s2].[ParentID]
FROM [SO6_D] AS [s2]
WHERE [s2].[OtherID] = @__myOtherID2_1
) AS [t0] ON [t].[ID] = [t0].[ParentID]
Finally, the most compact and preferred way in EF Core is to use navigation properties instead of manual joins in a LINQ to Entities query:
from a in db.A
from b in a.Bs.DefaultIfEmpty()
from c in b.Cs.Where(c => c.OtherID == myOtherID).DefaultIfEmpty()
from d in c.Ds.Where(d => d.OtherID == myOtherID2).DefaultIfEmpty()
select new { a, b, c, d }
which is also translated by EF Core to:
SELECT [s].[ID], [s0].[ID], [s0].[ParentID], [t].[ID], [t].[OtherID], [t].[ParentID], [t0].[ID], [t0].[OtherID], [t0].[ParentID]
FROM [SO6_A] AS [s]
LEFT JOIN [SO6_B] AS [s0] ON [s].[ID] = [s0].[ParentID]
LEFT JOIN (
SELECT [s1].[ID], [s1].[OtherID], [s1].[ParentID]
FROM [SO6_C] AS [s1]
WHERE [s1].[OtherID] = @__myOtherID_0
) AS [t] ON [s0].[ID] = [t].[ParentID]
LEFT JOIN (
SELECT [s2].[ID], [s2].[OtherID], [s2].[ParentID]
FROM [SO6_D] AS [s2]
WHERE [s2].[OtherID] = @__myOtherID2_1
) AS [t0] ON [t].[ID] = [t0].[ParentID]

Fair enough. 99.9% of EF questions about translating LEFT JOIN are a simple failure to use navigation properties.
EF Core is adding filtered includes in the next version; see Filtering on Include in EF Core.
Or you can project A along with selected child collections, something like this:
var q = from a in db.A
        select new
        {
            a,
            Bs = a.Bs,
            Ds = a.Bs.SelectMany(b => b.Ds).Where(d => d.OtherID == dOtherId)
        };

Related

SQL Subquery containing Joins

I'm using Hive HQL. I am trying to inner join two tables, filtering on issue_type = 'Impediment'.
Now I have a new requirement to join dm_jira__label to include the label and issue_id columns. I have tried having a subquery adding the issue_id and label by using a left join with dm_jira__label on issue_id
INNER JOIN datamart_core.dm_jira__release
ON dm_jira.issue_id = dm_jira__release.issue_id;
(
SELECT b.issue_id, b.label AS jira_label
FROM datamart_core.dm_jira__label as B, datamart_core.dm_jira__release AS K
LEFT JOIN b
ON b.issue_id=k.issue_id
);
WHERE dm_jira.issue_type = 'Impediment') AS J
I am getting the following error:
AnalysisException: Illegal table reference to non-collection type: 'b' Path resolved to type: STRUCT<issue_id:DOUBLE,label:STRING>
See the full code below. Thanks in advance.
SELECT DISTINCT
j.project_key AS jira_project_key,
j.issue_type,
j.issue_assignee AS impediment_owner,
j.issue_status AS impediment_status,
j.issue_priority AS impediment_priority,
j.issue_summary AS impediment_summary,
j.`release` AS jira_release,
j.sow AS sow_num,
j.issue_due_date_utc AS jira_issue_due_date_utc,
j.issue_id AS jira_issue_id,
s.sow_family
from (
--Subquery to combine dm_jira and dm_jira__release
SELECT dm_jira.project_key,
dm_jira.issue_type,
dm_jira.issue_assignee,
dm_jira.issue_status,
dm_jira.issue_priority,
dm_jira.issue_summary,
dm_jira.issue_due_date_utc,
dm_jira.issue_id,
dm_jira__release.`release`,
dm_jira__release.sow
from datamart_core.dm_jira
INNER JOIN datamart_core.dm_jira__release
ON dm_jira.issue_id = dm_jira__release.issue_id;
(
SELECT b.issue_id, b.label AS jira_label
FROM datamart_core.dm_jira__label as B, datamart_core.dm_jira__release AS K
LEFT JOIN b
ON b.issue_id=k.issue_id
);
WHERE dm_jira.issue_type = 'Impediment') AS J
INNER JOIN datamart_core.dm_asoe_jira_scrum_summary AS S
ON j.`release` = s.jira_release
AND j.sow = s.sow_num
AND j.project_key = s.jira_project_key;
; ends a whole statement, don't use it at the end of sub-queries or joins.
Using meaningless aliases such as B or K or J harms readability, don't do it.
FROM x, y is the same as FROM x CROSS JOIN y, it's not a list of tables you're going to join. This means that you have the following code...
(
SELECT
b.issue_id, b.label AS jira_label
FROM
datamart_core.dm_jira__label as B
CROSS JOIN
datamart_core.dm_jira__release AS K
LEFT JOIN
b
ON b.issue_id=k.issue_id
)
The b in the LEFT JOIN isn't a table, and causes your syntax error.
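The comma-versus-CROSS JOIN point can be checked with a small sketch. It uses Python's sqlite3 module rather than Hive, and the table names `jira_label` and `jira_release` plus their rows are hypothetical stand-ins for the tables in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jira_label (issue_id INTEGER, label TEXT);
    CREATE TABLE jira_release (issue_id INTEGER, sow TEXT);
    INSERT INTO jira_label VALUES (1, 'ui'), (2, 'db');
    INSERT INTO jira_release VALUES (1, 'S1'), (2, 'S2'), (3, 'S3');
""")

# FROM x, y pairs every row of x with every row of y ...
comma = conn.execute("""
    SELECT l.issue_id, r.issue_id
    FROM jira_label AS l, jira_release AS r
    ORDER BY 1, 2
""").fetchall()

# ... exactly like an explicit CROSS JOIN.
cross = conn.execute("""
    SELECT l.issue_id, r.issue_id
    FROM jira_label AS l CROSS JOIN jira_release AS r
    ORDER BY 1, 2
""").fetchall()

# What was probably intended: keep every release row, attach a label if one exists.
left = conn.execute("""
    SELECT r.issue_id, l.label
    FROM jira_release AS r
    LEFT JOIN jira_label AS l ON l.issue_id = r.issue_id
    ORDER BY r.issue_id
""").fetchall()

print(len(comma), comma == cross)  # 6 True
print(left)                        # [(1, 'ui'), (2, 'db'), (3, None)]
```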
Then, your sub query just sits in the middle of the code, it's not joined on or used in any way. I think you intended a pattern more like this...
FROM
datamart_core.dm_jira
INNER JOIN
datamart_core.dm_jira__release
ON dm_jira.issue_id = dm_jira__release.issue_id
LEFT JOIN
(
<your sub-query>
)
AS fubar
ON fubar.something = something.else
WHERE
dm_jira.issue_type = 'Impediment'
Even then, you don't actually need nested sub-queries at all. You can just keep adding joins, such as this...
SELECT
jira.project_key AS jira_project_key,
jira.issue_type,
jira.issue_assignee AS impediment_owner,
jira.issue_status AS impediment_status,
jira.issue_priority AS impediment_priority,
jira.issue_summary AS impediment_summary,
jrel.`release` AS jira_release,
jrel.sow AS sow_num,
jira.issue_due_date_utc AS jira_issue_due_date_utc,
jira.issue_id AS jira_issue_id,
jlab.label,
summ.sow_family
FROM
datamart_core.dm_jira AS jira
INNER JOIN
datamart_core.dm_jira__release AS jrel
ON jrel.issue_id = jira.issue_id
LEFT JOIN
datamart_core.dm_jira__label AS jlab
ON jlab.issue_id = jrel.issue_id
INNER JOIN
datamart_core.dm_asoe_jira_scrum_summary AS summ
ON summ.jira_release = jrel.`release`
AND summ.sow_num = jrel.sow
AND summ.jira_project_key = jira.project_key
WHERE
jira.issue_type = 'Impediment'
;

Compare list of information with other Lists SQL

This is the database.
autorizacao_procedimento.tipo can be P or N
procedimento.tipo can be A or F
Rules
Each autorizacao has 4 lists:
A list of procedimento.tipo = A that must be present, autorizacao_procedimento.tipo = P (1*)
A list of procedimento.tipo = A that must not be present, autorizacao_procedimento.tipo = N (2*)
A list of procedimento.tipo = F that must be present, autorizacao_procedimento.tipo = P (1*)
A list of procedimento.tipo = F that must not be present, autorizacao_procedimento.tipo = N (2*)
1* A solicitacao must have all of the specified procedimentos to be valid.
2* A solicitacao must not have any of the specified procedimentos to be valid.
Those relations are stored in the table autorizacao_procedimento.
The table solicitacao_procedimento makes the relations between solicitacao and procedimento.
What do I need to do?
I need to find a list of autorizacao that match with the solicitacao by codigo following the rules.
This is one of my attempts:
SELECT *
FROM ( select a.id, p.id as procedimentoId, p.codigo, p.tipo as procedimentoTipo, ap.tipo as autorizacaoProcedimentoTipo
from autorizacao_2.autorizacao a
inner join autorizacao_2.autorizacao_procedimento ap on a.id = ap.autorizacao_id
inner join autorizacao_2.procedimento p on ap.procedimento_id = p.id where a.id = 12 order by a.id, p.tipo, ap.tipo) AS cf
left join autorizacao_2.solicitacao_procedimento sp on sp.procedimento_id = cf.procedimentoId
left join autorizacao_2.solicitacao s on sp.solicitacao_id = s.id
where s.id = '24'
Here I was trying to get all procedimentos of the autorizacao and compare them with the procedimentos in the solicitacao.
I'm doing this in a Spring Boot / Spring Data JPA application.
I created a custom repository to use criteria with a native query.
This is how the data is:
select a.id as autorizacaoId
from autorizacao_2.autorizacao a
inner join autorizacao_2.autorizacao_procedimento ap on a.id = ap.autorizacao_id
where (ap.tipo = 'P'
and ap.procedimento_id not in (select sp.procedimento_id
from autorizacao_2.solicitacao s
inner join autorizacao_2.solicitacao_procedimento sp on s.id = sp.solicitacao_id
where s.id = 1941))
OR (ap.tipo = 'U' and ap.procedimento_id in (select sp.procedimento_id
from autorizacao_2.solicitacao s
inner join autorizacao_2.solicitacao_procedimento sp on s.id = sp.solicitacao_id
where s.id = 1941))
group by a.id;
I found this solution but I'm not sure about the performance.
This solution returns the autorizacao that doesn't match with the solicitacao.
Now, I'm trying to find a way to return only autorizacao that matches.

Select records where every record in one-to-many join matches a condition

How can I write a SQL query that returns records from table A only if every associated record from table B matches a condition?
I'm working in Ruby, and I can encode this logic for a simple collection like so:
array_of_A.select { |a| a.associated_bs.all? { |b| b.matches_condition? } }
I am being generic in the construction, because I'm working on a general tool that will be used across a number of distinct situations.
What I know to be the case is that INNER JOIN is the equivalent of
array_of_A.select { |a| a.associated_bs.any? { |b| b.matches_condition? } }
I have tried both:
SELECT DISTINCT "A".* FROM "A"
INNER JOIN "B"
ON "B"."a_id" = "A"."id"
WHERE "B"."string" = 'STRING'
as well as:
SELECT DISTINCT "A".* FROM "A"
INNER JOIN "B"
ON "B"."a_id" = "A"."id"
AND "B"."string" = 'STRING'
In both cases (as I expected), it returned records from table A if any associated record from B matched the condition. I'm sure there's a relatively simple solution, but my understanding of SQL just isn't providing it to me at the moment. And all of my searching thru SO and Google has proven fruitless.
I would suggest the following:
select distinct a.*
from a inner join
(
select b.a_id
from b
group by b.a_id
having min(b.string) = max(b.string) and min(b.string) = 'string'
) c on a.id = c.a_id
Alternatively:
select distinct a.*
from a inner join b on a.id = b.a_id
where not exists (select 1 from b c where c.a_id = a.id and c.string <> 'string')
Note: In the above examples, only change the symbols a and b to the names of your tables; the other identifiers are merely aliases and should not be changed.
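As a quick check of both suggestions, the following sketch runs them with Python's sqlite3 module on made-up data (three A rows: one whose Bs all match, one with a mixed set, and one with no Bs at all). The schema and rows are assumptions for illustration only. It also shows a variant without the inner join, which behaves like Ruby's all? on an empty collection and therefore also returns the A row that has no Bs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (a_id INTEGER, string TEXT);
    INSERT INTO a (id) VALUES (1), (2), (3);
    -- a 1: every b matches; a 2: mixed; a 3: no b rows at all.
    INSERT INTO b (a_id, string) VALUES
        (1, 'string'), (1, 'string'),
        (2, 'string'), (2, 'other');
""")

# Aggregate form: all values in the group are equal and equal to 'string'.
having_form = conn.execute("""
    SELECT DISTINCT a.id
    FROM a INNER JOIN (
        SELECT b.a_id
        FROM b
        GROUP BY b.a_id
        HAVING MIN(b.string) = MAX(b.string) AND MIN(b.string) = 'string'
    ) c ON a.id = c.a_id
    ORDER BY a.id
""").fetchall()

# NOT EXISTS form: no associated row violates the condition.
not_exists_form = conn.execute("""
    SELECT DISTINCT a.id
    FROM a INNER JOIN b ON a.id = b.a_id
    WHERE NOT EXISTS (SELECT 1 FROM b c WHERE c.a_id = a.id AND c.string <> 'string')
    ORDER BY a.id
""").fetchall()

# Without the inner join, an A with no Bs passes vacuously (like all? on []).
vacuous_form = conn.execute("""
    SELECT a.id
    FROM a
    WHERE NOT EXISTS (SELECT 1 FROM b c WHERE c.a_id = a.id AND c.string <> 'string')
    ORDER BY a.id
""").fetchall()

print(having_form)      # [(1,)]
print(not_exists_form)  # [(1,)]
print(vacuous_form)     # [(1,), (3,)]
```

Which behaviour is wanted for parents without children depends on the use case; the inner join in the suggested queries is what excludes them.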

Multiple association on same entity with hibernate Criteria

I would like to use this query in Hibernate criteria
SELECT a.a_id, b.b_id, b.b_description, sum(c1.c_score) AS score1, sum(c2.c_score) AS score2
FROM b, a, d, c
LEFT OUTER JOIN c AS c1 ON c1.c_id = c.c_id AND c1.c_comment = 'good'
LEFT OUTER JOIN c AS c2 ON c2.c_id = c.c_id AND c2.c_comment = 'nogood'
WHERE b.b_Id = d.d_id
AND d.d_id = c.c_id
AND c.c_foreignkey_a_id = a.a_id
GROUP BY a.a_id, b.b_id, b.b_description
My problem, in this case, is the 2 associations on the same entity ( C => C1 / C => C2).
And I want to know if there is a way to do this with criteria ?
Thanks ! - EZ
It's a serious limitation when using Hibernate Criteria (Now Deprecated).
You can try one of the two "solutions":
You can map the association twice (read-only) ... not really clean
You can try to express your request with criteria subqueries.

JOIN and LEFT JOIN equivalent in LINQ

I am working with the following SQL query:
SELECT
a.AppointmentId,
a.Status,
a.Type,
a.Title,
b.Days,
d.Description,
e.FormId
FROM Appointment a (nolock)
LEFT JOIN AppointmentFormula b (nolock)
ON a.AppointmentId = b.AppointmentId and b.RowStatus = 1
JOIN Type d (nolock)
ON a.Type = d.TypeId
LEFT JOIN AppointmentForm e (nolock)
ON e.AppointmentId = a.AppointmentId
WHERE a.RowStatus = 1
AND a.Type = 1
ORDER BY a.Type
I am unsure how to achieve the JOINs in LINQ. All my tables have foreign key relationships.
SELECT A.X, B.Y
FROM A JOIN B ON A.X = B.Y
This linq method call (to Join) will generate the above Join.
var query = A.Join
(
B,
a => a.x,
b => b.y,
(a, b) => new {a.x, b.y} //if you want more columns - add them here.
);
SELECT A.X, B.Y
FROM A LEFT JOIN B ON A.X = B.Y
These linq method calls (to GroupJoin, SelectMany, DefaultIfEmpty) will produce the above Left Join
var query = A.GroupJoin
(
B,
a => a.x,
b => b.y,
(a, g) => new {a, g}
).SelectMany
(
z => z.g.DefaultIfEmpty(),
(z, b) =>
new { x = z.a.x, y = b.y } //if you want more columns - add them here.
);
The key concept here is that Linq's methods produce hierarchically shaped results, not flattened row-column shapes.
Linq's GroupBy produces results shaped in a hierarchy with a grouping key matched to a collection of elements (which may not be empty). SQL's GroupBy clause produces a grouping key with aggregated values - there is no sub-collection to work with.
Similarly, Linq's GroupJoin produces a hierarchical shape - a parent record matched to a collection of child records (which may be empty). Sql's LEFT JOIN produces a parent record matched to each child record, or a null child record if there are no other matches. To get to Sql's shape from Linq's shape, one must unpack the collection of child records with SelectMany - and deal with empty collections of child records using DefaultIfEmpty.
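The difference in shape can be sketched outside of LINQ. The snippet below is plain Python, not a LINQ API: `group_join` and `default_if_empty` are hypothetical helpers that only mimic what GroupJoin and DefaultIfEmpty are described as doing above, and the sample records are made up.

```python
parents = [{"x": 1}, {"x": 2}]
children = [{"y": 1, "name": "a"}, {"y": 1, "name": "b"}]

def group_join(outer, inner, outer_key, inner_key):
    """Hierarchical shape: each outer element paired with a list of its matches (possibly empty)."""
    return [(o, [i for i in inner if inner_key(i) == outer_key(o)]) for o in outer]

def default_if_empty(items, default=None):
    """Return the items unchanged, or a single default element if there are none."""
    return items if items else [default]

# Step 1: the GroupJoin-like shape, one entry per parent.
grouped = group_join(parents, children, lambda p: p["x"], lambda c: c["y"])

# Step 2: flatten (the SelectMany step); parents without children get one row with None.
flat = [
    (p["x"], c["name"] if c is not None else None)
    for p, matches in grouped
    for c in default_if_empty(matches)
]

print([(p["x"], len(m)) for p, m in grouped])  # [(1, 2), (2, 0)]
print(flat)                                    # [(1, 'a'), (1, 'b'), (2, None)]
```

The flattened list has the row-per-match shape of a SQL LEFT JOIN, while `grouped` keeps one entry per parent.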
And here's my attempt at linquifying the SQL in the question:
var query =
    from a in Appointment
    where a.RowStatus == 1
    where a.Type == 1
    from b in a.AppointmentFormula.Where(af => af.RowStatus == 1).DefaultIfEmpty()
    from d in a.TypeRecord // a has a Type column and is related to a table named Type; TypeRecord disambiguates the names
    from e in a.AppointmentForm.DefaultIfEmpty()
    orderby a.Type
    select new { a.AppointmentId, a.Status, a.Type, a.Title, b.Days, d.Description, e.FormId };
You may have to tweak this slightly as I was going off the cuff, but there are a couple of major things to keep in mind. If you have your relationships set up properly in your dbml, you should be able to do inner joins implicitly and just access the data through your initial table. Also, left joins in LINQ are not as straightforward as we may hope, and you have to go through the DefaultIfEmpty syntax in order to make them happen. I created an anonymous type here, but you may want to put the result into a DTO class or something to that effect. I also didn't know what you wanted to do in the case of nulls, but you can use the ?? syntax to define a value to give the variable if the value is null. Let me know if you have additional questions...
var query = from a in context.Appointment
            join b in context.AppointmentFormula on a.AppointmentId equals b.AppointmentId into temp
            // Filter before DefaultIfEmpty so the LEFT JOIN is preserved, as in the SQL's ON clause
            from c in temp.Where(x => x.RowStatus == 1).DefaultIfEmpty()
            join d in context.AppointmentForm on a.AppointmentId equals d.AppointmentId into temp2
            from e in temp2.DefaultIfEmpty()
            where a.RowStatus == 1 && a.Type == 1
            orderby a.Type
            select new { a.AppointmentId, a.Status, a.Type, a.Title, Days = c.Days ?? 0, a.Type.Description, FormId = e.FormId ?? 0 };
If you want to preserve the (NOLOCK) hints, I have blogged a handy solution using extension methods in C#. Note that this is the same as adding nolock hints to every table in the query.