SQL JOIN clause: substituting a bunch of flags with one Enum - grammar

I'm trying to implement a DSL containing some parts of SQL SELECT queries.
The JOIN syntax between two tables is specified (e.g. for PostgreSQL) like this:
// one of these:
[ INNER ] JOIN
LEFT [ OUTER ] JOIN
RIGHT [ OUTER ] JOIN
FULL [ OUTER ] JOIN
CROSS JOIN
Note the optional keywords.
The following Xtext grammar works (sort of):
Join:
'INNER'? inner?='JOIN'
| left?='LEFT' 'OUTER'? 'JOIN'
| right?='RIGHT' 'OUTER'? 'JOIN'
| full?='FULL' 'OUTER'? 'JOIN'
| cross?='CROSS' 'JOIN'
;
The model inference will of course create a bunch of flags which cannot be handled nicely later.
What I really want is an enum like this:
enum JoinType: INNER_JOIN | LEFT_JOIN | RIGHT_JOIN | FULL_JOIN | CROSS_JOIN;
I want an enum because:
The generator et al. can use a simple switch statement.
The processing of the optional keywords and the embedded whitespace is grammar work.
Is there any reasonable way to connect that enum to the rest of the grammar?

You can define them individually as separate rules. It is not as elegant as an enum, but it works as a workaround:
Join:
(joins += JoinTypes)+ // or however you wish
;
JoinTypes:
INNER_JOIN | LEFT_JOIN | RIGHT_JOIN | FULL_JOIN | CROSS_JOIN
;
Then define each of them however you want:
INNER_JOIN:
// whatever you want, optional keywords etc.
;
LEFT_JOIN:
...
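The target of the question (every keyword variant collapsing to one enum literal) can be illustrated outside Xtext. The following is a minimal Python sketch, not Xtext code: the function name `parse_join_type` and the lookup table are hypothetical, and only the enum literals come from the question. It shows the mapping a grammar or value converter would have to produce.

```python
from enum import Enum


class JoinType(Enum):
    INNER_JOIN = "INNER JOIN"
    LEFT_JOIN = "LEFT JOIN"
    RIGHT_JOIN = "RIGHT JOIN"
    FULL_JOIN = "FULL JOIN"
    CROSS_JOIN = "CROSS JOIN"


# Every accepted keyword sequence, including the optional INNER/OUTER
# variants, maps to exactly one enum literal (hypothetical table).
_KEYWORD_FORMS = {
    ("JOIN",): JoinType.INNER_JOIN,
    ("INNER", "JOIN"): JoinType.INNER_JOIN,
    ("LEFT", "JOIN"): JoinType.LEFT_JOIN,
    ("LEFT", "OUTER", "JOIN"): JoinType.LEFT_JOIN,
    ("RIGHT", "JOIN"): JoinType.RIGHT_JOIN,
    ("RIGHT", "OUTER", "JOIN"): JoinType.RIGHT_JOIN,
    ("FULL", "JOIN"): JoinType.FULL_JOIN,
    ("FULL", "OUTER", "JOIN"): JoinType.FULL_JOIN,
    ("CROSS", "JOIN"): JoinType.CROSS_JOIN,
}


def parse_join_type(text: str) -> JoinType:
    """Normalize case and whitespace, then look up the join type."""
    tokens = tuple(text.upper().split())
    try:
        return _KEYWORD_FORMS[tokens]
    except KeyError:
        raise ValueError(f"not a JOIN clause: {text!r}") from None
```

A generator can then switch on the single `JoinType` value instead of testing five boolean flags.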

Related

KQL - Joining 2 tables using Equality by Value

I am attempting to join two tables in KQL within Microsoft Defender.
These tables don't have matching columns however they do have matching fields.
LeftTable: EmailEvents Field: RecipientEmailAddress
RightTable: IdentityInfo Field: AccountUpn
The query I am using is as follows
EmailEvents
| where EmailDirection == "Inbound"
| where Subject == "invoice" or SenderFromAddress == "testtest#outlook.com"
| project RecipientEmailAddress, Subject, InternetMessageId, SenderFromAddress
| join kind=inner (IdentityInfo
| distinct AccountUpn, AccountDisplayName, JobTitle , Department, City, Country)
on $left.RecipientEmailAddress -- $right.AccountUpn
I am seeing the error
Semantic error
Error message
join: only column entities or equality expressions are allowed in this context.
How to resolve
Fix semantic errors in your query
Can someone assist? I am not sure where I am going wrong here.
Try replacing this:
on $left.RecipientEmailAddress -- $right.AccountUpn
with this:
on $left.RecipientEmailAddress == $right.AccountUpn

Why does Django QuerySet produce a query with 2 inner joins with AND clauses, instead of a single inner join with OR, when chaining filters?

I'm following the Django docs on making queries, and this example showed up:
Blog.objects.filter(entry__headline__contains='Lennon').filter(entry__pub_date__year=2008)
I was hoping that the corresponding SQL query would involve only one inner join and an OR clause, since the corresponding results are entries that satisfy either one or both of the conditions.
Nevertheless, this is what an inspection to the queryset's query returned:
>>> qs = Blog.objects.filter(entry__headline__contains='Lennon').filter(entry__pub_date__year=2008)
>>> print(qs.query)
SELECT "blog_blog"."id", "blog_blog"."name", "blog_blog"."tagline"
FROM "blog_blog"
INNER JOIN "blog_entry" ON ("blog_blog"."id" = "blog_entry"."blog_id")
INNER JOIN "blog_entry" T3 ON ("blog_blog"."id" = T3."blog_id")
WHERE ("blog_entry"."headline" LIKE %Lennon% ESCAPE '\'
AND T3."pub_date" BETWEEN 2008-01-01 AND 2008-12-31)
You can see that it makes TWO inner joins.
Wouldn't the result be the same as:
SELECT "blog_blog"."id", "blog_blog"."name", "blog_blog"."tagline"
FROM "blog_blog"
INNER JOIN "blog_entry" ON ("blog_blog"."id" = "blog_entry"."blog_id")
WHERE ("blog_entry"."headline" LIKE %Lennon% ESCAPE '\' OR "blog_entry"."pub_date" BETWEEN 2008-01-01 AND 2008-12-31)
?
And this query is faster.
Well, after some research I've learned a few things.
First, from this answer, the way to implement an OR query is:
Blog.objects.filter(Q(entry__headline__contains='Lennon') | Q(entry__pub_date__year=2008))
And, from this fiddling:
https://www.db-fiddle.com/f/f8SGzTLeyr7DNZUaCx9HVL/0
I realized that the two queries in the OP don't get the same results: in the 2 inner joins, there is a cartesian product of results (2 "Lennon" * 2 "2008"), while in the second one there are 3 results (and it's indeed faster).
I assume it is because you call filter() twice.
Try
qs = Blog.objects.filter(entry__headline__contains='Hello', entry__pub_date__year=2008)
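The difference between the query shapes can be checked with plain SQL. The sketch below uses Python's `sqlite3` module with an in-memory database. The table names follow the printed query, while the sample rows, the aliases `e1`/`e2`, and the helper `count_rows` are made up for illustration. It compares two joins with AND (chained `filter()` calls), one join with OR, and one join with AND (a single `filter()` call with both conditions).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blog_blog (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE blog_entry (
        id INTEGER PRIMARY KEY,
        blog_id INTEGER REFERENCES blog_blog(id),
        headline TEXT,
        pub_date TEXT
    );
    -- Hypothetical data: two "Lennon" entries and two entries from 2008.
    INSERT INTO blog_blog VALUES (1, 'Beatles Blog');
    INSERT INTO blog_entry VALUES
        (1, 1, 'Lennon interview', '2008-03-01'),
        (2, 1, 'Lennon retrospective', '2007-06-15'),
        (3, 1, 'Studio news', '2008-09-30');
""")

# Chained filter() calls: each condition gets its own join to blog_entry.
TWO_JOINS_AND = """
    SELECT b.id FROM blog_blog AS b
    INNER JOIN blog_entry AS e1 ON b.id = e1.blog_id
    INNER JOIN blog_entry AS e2 ON b.id = e2.blog_id
    WHERE e1.headline LIKE '%Lennon%'
      AND e2.pub_date BETWEEN '2008-01-01' AND '2008-12-31'
"""

# The query proposed in the question: one join, conditions combined with OR.
ONE_JOIN_OR = """
    SELECT b.id FROM blog_blog AS b
    INNER JOIN blog_entry AS e1 ON b.id = e1.blog_id
    WHERE e1.headline LIKE '%Lennon%'
       OR e1.pub_date BETWEEN '2008-01-01' AND '2008-12-31'
"""

# A single filter() with both conditions: one join, both must hold
# for the same entry.
ONE_JOIN_AND = """
    SELECT b.id FROM blog_blog AS b
    INNER JOIN blog_entry AS e1 ON b.id = e1.blog_id
    WHERE e1.headline LIKE '%Lennon%'
      AND e1.pub_date BETWEEN '2008-01-01' AND '2008-12-31'
"""


def count_rows(sql):
    """Return the number of rows a query produces."""
    return len(conn.execute(sql).fetchall())


two_joins_and = count_rows(TWO_JOINS_AND)  # 2 Lennon rows x 2 rows from 2008
one_join_or = count_rows(ONE_JOIN_OR)      # every entry matching either test
one_join_and = count_rows(ONE_JOIN_AND)    # only the entry matching both
```

With this sample data the three counts are 4, 3 and 1, so the three forms are not interchangeable: chained filters ask for a blog that has some entry matching each condition, not one entry matching both.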

How to Group_Concat with a 3-table JOIN for genealogy

I am failing to grasp how I can get the following outcome. I thought perhaps via GROUP_CONCAT, but I am also joining on 3 tables, and unclear on the correct syntax or if this is even the best approach.
Generic table layout:
Table Users: user_id | first | last
Table Orgs: org_id | org_name
Table Relationship: user_id | org_id | start_year | end_year
The relationship table has MANY entries, that may be associated with that specific user_id.
I need to get the User columns: id, first, last. I'd like to try and group the org data into 1 concatenated, delimited field. Maybe a double group_concatenation is needed? Which would consist of the org_id, org_name, start_year & end_year for all records in the relationship table that match the user_id. I'm hoping for an output like this:
Each '|' represents a new column/piece of data.
If there was only 1 org_id associated with the user_id, the output would be (similar) to:
user_id | first | last | org_id-org_name-start_year-end_year
If there were more than 1 org found/associated with that user_id, the output would have more concatenated/delimited data in the same column:
user_id | first | last | org_id-org_name-start_year-end_year^org_id-org_name-start_year-end_year^org_id-org_name-start_year-end_year
(Notice the '-' delimiter between values and the '^' delimiter between new 'org-grouped' data.)
When I grab that data, I can then just break it up (on the backend/PHP side of things) into an array or whatever.
I'm not sure how I can GROUP_CONCAT (if that is even the best approach here?) while I have to JOIN on 3 separate tables.
This is not my REAL query. (I'm not sure if I should post it, as I do not want to cause any confusion as it does NOT match my dummy table/column names.)
I just wanted to show my attempt that gets me 3 individual rows, (using my JOINS) but no GROUP_CONCAT stuff:
SELECT genealogy_users.imis_id, genealogy_users.full_name,
genealogy_users.member_email, genealogy_orgs.org_id,
genealogy_orgs.org_name, genealogy_relations.user_id,
genealogy_relations.relation_type, genealogy_relations.start_year,
genealogy_relations.end_year
FROM genealogy_users
INNER JOIN genealogy_relations ON genealogy_users.imis_id = genealogy_relations.user_id
INNER JOIN genealogy_orgs ON genealogy_relations.org_id = genealogy_orgs.org_id
WHERE genealogy_users.imis_id = '00003';
UPDATE:
Well I seemed to have fudged my way through it. But I'm not sure how legit this is.
It's -ALMOST- there. I believe I still need a JOIN or something? Since the genealogy_orgs.org_id = '84864' is hardcoded, and it should NOT be. Maybe it needs to come from a JOIN or something?
SELECT genealogy_users.*,
(SELECT GROUP_CONCAT(org_id,'-',
(SELECT org_name FROM genealogy_orgs WHERE genealogy_orgs.org_id = '84864'),
'-',start_year,'-',end_year,'^')
FROM genealogy_relations WHERE genealogy_relations.user_id = genealogy_users.imis_id
) AS alumni_list
FROM genealogy_users
WHERE genealogy_users.imis_id = '00003';
UPDATE 2:
My final attempt, which I think is getting me what I need. (But it's late, and I'll check back tomorrow and look at things more closely.)
SELECT genealogy_users.imis_id, genealogy_users.full_name,
genealogy_users.member_email, genealogy_orgs.org_id,
genealogy_orgs.org_name, genealogy_relations.user_id,
genealogy_relations.relation_type, genealogy_relations.start_year,
genealogy_relations.end_year,
(SELECT GROUP_CONCAT(org_id,'-',org_name,'-',start_year,'-',end_year,'^')
FROM genealogy_relations
WHERE genealogy_relations.user_id = genealogy_users.imis_id
) AS alumni_list
FROM genealogy_users
INNER JOIN genealogy_relations ON genealogy_users.imis_id = genealogy_relations.user_id
INNER JOIN genealogy_orgs ON genealogy_relations.org_id = genealogy_orgs.org_id
WHERE genealogy_users.imis_id = '00003';
Is there anything to make note of in the above attempt? Or is there a better approach? Hopefully something easily readable so it makes sense?
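One possible approach is to join all three tables once and aggregate with GROUP BY, so that each user yields a single row. The sketch below uses Python's `sqlite3` module with the generic table layout from the question; the sample rows and the helper `split_org_list` are invented for illustration. SQLite takes the separator as the second argument of GROUP_CONCAT and concatenates with `||`. In MySQL the same idea would presumably be written with `CONCAT_WS('-', ...)` inside `GROUP_CONCAT(... SEPARATOR '^')`, which also avoids the trailing '^' and default comma separator in the attempts above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, first TEXT, last TEXT);
    CREATE TABLE orgs (org_id INTEGER PRIMARY KEY, org_name TEXT);
    CREATE TABLE relationship (
        user_id INTEGER, org_id INTEGER, start_year INTEGER, end_year INTEGER
    );
    -- Hypothetical sample data.
    INSERT INTO users VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
    INSERT INTO orgs VALUES (10, 'Alpha'), (20, 'Beta');
    INSERT INTO relationship VALUES
        (1, 10, 2001, 2004), (1, 20, 2005, 2009), (2, 10, 1999, 2000);
""")

# One row per user; each org becomes "id-name-start-end", joined by '^'.
QUERY = """
    SELECT u.user_id, u.first, u.last,
           GROUP_CONCAT(
               o.org_id || '-' || o.org_name || '-' ||
               r.start_year || '-' || r.end_year,
               '^'
           ) AS org_list
    FROM users AS u
    INNER JOIN relationship AS r ON r.user_id = u.user_id
    INNER JOIN orgs AS o ON o.org_id = r.org_id
    GROUP BY u.user_id, u.first, u.last
    ORDER BY u.user_id
"""


def split_org_list(org_list):
    """Break the delimited column back into one dict per organization."""
    keys = ("org_id", "org_name", "start_year", "end_year")
    return [dict(zip(keys, item.split("-"))) for item in org_list.split("^")]


rows = conn.execute(QUERY).fetchall()
orgs_by_user = {
    user_id: split_org_list(org_list) for user_id, _, _, org_list in rows
}
```

Splitting on '-' and '^' only works while neither character can appear in an organization name; if that cannot be guaranteed, returning one row per relationship and grouping on the application side is the safer design.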

CASE Statement and IN() Operator in JOIN Condition

I'm trying to create a join that follows the following logic:
If our company is the Plaintiff, join to the following role types in
the table: Defense Firm, Defense Attorney
If our company is the Defendant, join to the following role types in
the table: Plaintiff Firm, Plaintiff Attorney
So far, I have this code written in the join, but it always produces an error for every syntax I've tried:
WHERE
TRIAL.TRIAL_ID = OPPOSITION.TRIAL_ID
AND OPPOSITION.ROLE IN
CASE
WHEN TRIAL.POSITION = 'Plaintiff'
THEN ('Defense Firm','Defense Attorney' )
WHEN TRIAL.POSITION = 'Defendant'
THEN ('Plaintiff Firm','Plaintiff Attorney')
END
We are currently running on Oracle (??)g.
Is this sort of join logic even possible?
EDITS:
Fixed the Defendant/Plaintiff mixup in the code section
Not sure what version of Oracle we're on.
You can use AND/OR logic, with appropriate parenthetical grouping, to achieve this (if I'm following what you need), something like:
WHERE TRIAL.TRIAL_ID = OPPOSITION.TRIAL_ID
AND (
(TRIAL.POSITION = 'Plaintiff'
AND OPPOSITION.ROLE IN ('Defense Firm', 'Defense Attorney'))
OR
(TRIAL.POSITION = 'Defendant'
AND OPPOSITION.ROLE IN ('Plaintiff Firm', 'Plaintiff Attorney'))
)
Although these look like they should be part of an ANSI JOIN clause, rather than a WHERE clause...
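The AND/OR condition can be moved into an ANSI JOIN clause as suggested. Here is a minimal sketch using Python's `sqlite3` module rather than Oracle; the lowercase table layout, the `name` column, and the sample rows are assumptions made for the example. Only the join condition itself mirrors the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trial (trial_id INTEGER PRIMARY KEY, position TEXT);
    CREATE TABLE opposition (trial_id INTEGER, role TEXT, name TEXT);
    -- Hypothetical sample data.
    INSERT INTO trial VALUES (1, 'Plaintiff'), (2, 'Defendant');
    INSERT INTO opposition VALUES
        (1, 'Defense Firm', 'Firm A'),
        (1, 'Defense Attorney', 'Attorney B'),
        (1, 'Plaintiff Firm', 'Firm C'),
        (2, 'Plaintiff Attorney', 'Attorney D'),
        (2, 'Defense Firm', 'Firm E');
""")

# The position of our company decides which opposing roles are joined.
QUERY = """
    SELECT trial.trial_id, opposition.role, opposition.name
    FROM trial
    INNER JOIN opposition
        ON trial.trial_id = opposition.trial_id
       AND (
            (trial.position = 'Plaintiff'
             AND opposition.role IN ('Defense Firm', 'Defense Attorney'))
            OR
            (trial.position = 'Defendant'
             AND opposition.role IN ('Plaintiff Firm', 'Plaintiff Attorney'))
       )
    ORDER BY trial.trial_id, opposition.role
"""

rows = conn.execute(QUERY).fetchall()
```

For trial 1 (our company is the plaintiff) only the two defense rows are joined, and for trial 2 only the plaintiff attorney row is joined.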

SQL: When and why are two on conditions allowed?

Question:
I recently had an interesting SQL problem.
I had to get the leasing contract for a leasing object.
The problem was, there could be multiple leasing contracts per room, and multiple leasing objects per room.
However, because of bad db tinkering, leasing contracts are assigned to the room, not the leasing object. So I had to take the contract number, and compare it to the leasing object number, in order to get the right results.
I thought this would do:
SELECT *
FROM T_Room
LEFT JOIN T_MAP_Room_LeasingObject
ON MAP_RMLOBJ_RM_UID = T_Room.RM_UID
LEFT JOIN T_LeasingObject
ON LOBJ_UID = MAP_RMLOBJ_LOBJ_UID
LEFT JOIN T_MAP_Room_LeasingContract
ON T_MAP_Room_LeasingContract.MAP_RMCTR_RM_UID = T_Room.RM_UID
LEFT JOIN T_Contracts
ON T_Contracts.CTR_UID = T_MAP_Room_LeasingContract.MAP_RMCTR_CTR_UID
AND T_Contracts.CTR_No LIKE ( ISNULL(T_LeasingObject.LOBJ_No, '') + '.%' )
WHERE ...
However, because the mapping table gets joined before I have the contract number, and I cannot get the contract number without having the mapping table, I get duplicated entries.
The problem is a little more complicated, as rooms having no leasing contract also needed to show up, so I couldn't just use an inner join.
With a little bit of experimenting, I found that this works as expected:
SELECT *
FROM T_Room
LEFT JOIN T_MAP_Room_LeasingObject
ON MAP_RMLOBJ_RM_UID = T_Room.RM_UID
LEFT JOIN T_LeasingObject
ON LOBJ_UID = MAP_RMLOBJ_LOBJ_UID
LEFT JOIN T_MAP_Room_LeasingContract
LEFT JOIN T_Contracts
ON T_Contracts.CTR_UID = T_MAP_Room_LeasingContract.MAP_RMCTR_CTR_UID
ON T_MAP_Room_LeasingContract.MAP_RMCTR_RM_UID = T_Room.RM_UID
AND T_Contracts.CTR_No LIKE ( ISNULL(T_LeasingObject.LOBJ_No, '') + '.%' )
WHERE ...
I now see why the two on conditions in one join, which usually are courtesy of query designer, can be useful, and what difference it makes.
I was wondering whether this is a MS-SQL/T-SQL specific thing, or whether this is standard sql.
So I tried it in PostgreSQL, with this query on three other tables:
SELECT *
FROM t_dms_navigation
LEFT JOIN t_dms_document
ON NAV_DOC_UID = DOC_UID
LEFT JOIN t_dms_project
ON PJ_UID = NAV_PJ_UID
and tried to turn it into one with two on conditions
SELECT *
FROM t_dms_navigation
LEFT JOIN t_dms_document
LEFT JOIN t_dms_project
ON PJ_UID = NAV_PJ_UID
ON NAV_DOC_UID = DOC_UID
That did not work in PostgreSQL, so I thought it was T-SQL specific, but I quickly tried it in MS-SQL too, only to find, to my surprise, that it doesn't work there either.
I thought it might be because of missing foreign keys, so I removed them on all tables in my room query, but it still did not work.
So here my question:
Why are two ON conditions even legal, does this have a name, and why does it not work in my second example?
It's standard SQL. Each JOIN has to have a corresponding ON clause. All you're doing is shifting around the order that the joins happen in [1] - it's a bit like changing the bracketing of an expression to get around precedence rules.
A JOIN B ON <cond1> JOIN C ON <cond2>
First joins A and B based on cond1. It then takes that combined rowset and joins it to C based on cond2.
A JOIN B JOIN C ON <cond1> ON <cond2>
First joins B and C based on cond1. It then takes A and joins it to the previous combined rowset, based on cond2.
It should work in PostgreSQL - here's the relevant part of the documentation of the SELECT statement:
where from_item can be one of:
[ ONLY ] table_name [ * ] [ [ AS ] alias [ ( column_alias [, ...] ) ] ]
( select ) [ AS ] alias [ ( column_alias [, ...] ) ]
with_query_name [ [ AS ] alias [ ( column_alias [, ...] ) ] ]
function_name ( [ argument [, ...] ] ) [ AS ] alias [ ( column_alias [, ...] | column_definition [, ...] ) ]
function_name ( [ argument [, ...] ] ) AS ( column_definition [, ...] )
from_item [ NATURAL ] join_type from_item [ ON join_condition | USING ( join_column [, ...] ) ]
It's that last line that's relevant. Notice that it's a recursive definition - what can be to the left and right of a join can be anything - including more joins.
[1] As always with SQL, this is the logical processing order - the system is free to perform physical processing in whatever sequence it feels will work best, provided the result is consistent.
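The effect of the join order can be reproduced on a small scale. The sketch below uses Python's `sqlite3` module with hypothetical tables loosely modelled on the room and contract example; the table names, the prefixed column names, and the fixed pattern 'A.%' are all invented. As far as I know, SQLite does not accept the two consecutive ON clauses, so the nested join is written with explicit parentheses, which expresses the same grouping as `A JOIN B JOIN C ON <cond1> ON <cond2>`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE room (rm_id INTEGER PRIMARY KEY);
    CREATE TABLE room_contract (map_rm_id INTEGER, map_ctr_id INTEGER);
    CREATE TABLE contract (ctr_id INTEGER PRIMARY KEY, ctr_no TEXT);
    -- Hypothetical data: room 1 has two contracts, room 2 has none.
    INSERT INTO room VALUES (1), (2);
    INSERT INTO room_contract VALUES (1, 10), (1, 11);
    INSERT INTO contract VALUES (10, 'A.1'), (11, 'B.1');
""")

# (room JOIN mapping) JOIN contract: the mapping rows are attached before
# the contract number is known, so the 'B.1' mapping survives as a NULL row.
FLAT = """
    SELECT rm_id, ctr_no
    FROM room
    LEFT JOIN room_contract ON map_rm_id = rm_id
    LEFT JOIN contract ON ctr_id = map_ctr_id AND ctr_no LIKE 'A.%'
"""

# room JOIN (mapping JOIN contract): mapping and contract are combined first,
# then the pair is attached to the room only if the number matches.
NESTED = """
    SELECT rm_id, ctr_no
    FROM room
    LEFT JOIN (room_contract
               LEFT JOIN contract ON ctr_id = map_ctr_id)
           ON map_rm_id = rm_id AND ctr_no LIKE 'A.%'
"""

flat_rows = conn.execute(FLAT).fetchall()
nested_rows = conn.execute(NESTED).fetchall()
```

The flat form returns three rows, including a spurious room 1 row with a NULL contract, while the nested form returns one row per room and still keeps room 2, which has no contract.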