SQL Subquery containing Joins

I'm using Hive HQL. I am inner joining two tables and filtering on issue_type = 'Impediment'.
I now have a new requirement to join dm_jira__label so that the label and issue_id columns are included. I tried adding a subquery that left joins dm_jira__label on issue_id:
INNER JOIN datamart_core.dm_jira__release
ON dm_jira.issue_id = dm_jira__release.issue_id;
(
SELECT b.issue_id, b.label AS jira_label
FROM datamart_core.dm_jira__label as B, datamart_core.dm_jira__release AS K
LEFT JOIN b
ON b.issue_id=k.issue_id
);
WHERE dm_jira.issue_type = 'Impediment') AS J
I am getting the following error:
AnalysisException: Illegal table reference to non-collection type: 'b' Path resolved to type: STRUCT<issue_id:DOUBLE,label:STRING>
See the full code below. Thanks in advance.
SELECT DISTINCT
j.project_key AS jira_project_key,
j.issue_type,
j.issue_assignee AS impediment_owner,
j.issue_status AS impediment_status,
j.issue_priority AS impediment_priority,
j.issue_summary AS impediment_summary,
j.`release` AS jira_release,
j.sow AS sow_num,
j.issue_due_date_utc AS jira_issue_due_date_utc,
j.issue_id AS jira_issue_id,
s.sow_family
from (
--Subquery to combine dm_jira and dm_jira__release
SELECT dm_jira.project_key,
dm_jira.issue_type,
dm_jira.issue_assignee,
dm_jira.issue_status,
dm_jira.issue_priority,
dm_jira.issue_summary,
dm_jira.issue_due_date_utc,
dm_jira.issue_id,
dm_jira__release.`release`,
dm_jira__release.sow
from datamart_core.dm_jira
INNER JOIN datamart_core.dm_jira__release
ON dm_jira.issue_id = dm_jira__release.issue_id;
(
SELECT b.issue_id, b.label AS jira_label
FROM datamart_core.dm_jira__label as B, datamart_core.dm_jira__release AS K
LEFT JOIN b
ON b.issue_id=k.issue_id
);
WHERE dm_jira.issue_type = 'Impediment') AS J
INNER JOIN datamart_core.dm_asoe_jira_scrum_summary AS S
ON j.`release` = s.jira_release
AND j.sow = s.sow_num
AND j.project_key = s.jira_project_key;

; ends a whole statement, so don't put it at the end of a sub-query or a join.
Meaningless aliases such as B, K or J harm readability; prefer short descriptive ones.
FROM x, y is the same as FROM x CROSS JOIN y; it is not a list of tables that you are about to join. This means that your sub-query is parsed as the following code...
(
SELECT
b.issue_id, b.label AS jira_label
FROM
datamart_core.dm_jira__label as B
CROSS JOIN
datamart_core.dm_jira__release AS K
LEFT JOIN
b
ON b.issue_id=k.issue_id
)
The b in the LEFT JOIN isn't a table, and causes your syntax error.
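The comma-versus-CROSS JOIN point is easy to check on toy data. The sketch below is only an illustration: it uses Python's built-in sqlite3 rather than Hive or Impala, and the tables label and rel with their rows are made up for the example.

```python
import sqlite3

# Hypothetical toy tables; SQLite stands in for the real engine here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE label (issue_id INTEGER, label TEXT);
    CREATE TABLE rel (issue_id INTEGER, release_name TEXT);
    INSERT INTO label VALUES (1, 'blocked'), (2, 'infra');
    INSERT INTO rel VALUES (1, 'R1'), (2, 'R2'), (3, 'R3');
""")

# A comma between tables produces every combination of rows (2 x 3).
comma_rows = conn.execute("SELECT COUNT(*) FROM label, rel").fetchone()[0]

# An explicit CROSS JOIN produces exactly the same result.
cross_rows = conn.execute(
    "SELECT COUNT(*) FROM label CROSS JOIN rel"
).fetchone()[0]

# A real join names a table and a condition; here one row per release.
joined_rows = conn.execute(
    "SELECT COUNT(*) FROM rel LEFT JOIN label ON label.issue_id = rel.issue_id"
).fetchone()[0]

print(comma_rows, cross_rows, joined_rows)
```

The first two counts are identical, which is why listing tables with commas and then adding a LEFT JOIN does not do what the question intended.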
Then, your sub-query just sits in the middle of the code; it's not joined on or used in any way. I think you intended a pattern more like this...
FROM
datamart_core.dm_jira
INNER JOIN
datamart_core.dm_jira__release
ON dm_jira.issue_id = dm_jira__release.issue_id
LEFT JOIN
(
<your sub-query>
)
AS fubar
ON fubar.something = something.else
WHERE
dm_jira.issue_type = 'Impediment'
Even then, you don't actually need nested sub-queries at all. You can just keep adding joins, such as this...
SELECT
jira.project_key AS jira_project_key,
jira.issue_type,
jira.issue_assignee AS impediment_owner,
jira.issue_status AS impediment_status,
jira.issue_priority AS impediment_priority,
jira.issue_summary AS impediment_summary,
jrel.`release` AS jira_release,
jrel.sow AS sow_num,
jira.issue_due_date_utc AS jira_issue_due_date_utc,
jira.issue_id AS jira_issue_id,
jlab.label,
summ.sow_family
FROM
datamart_core.dm_jira AS jira
INNER JOIN
datamart_core.dm_jira__release AS jrel
ON jrel.issue_id = jira.issue_id
LEFT JOIN
datamart_core.dm_jira__label AS jlab
ON jlab.issue_id = jrel.issue_id
INNER JOIN
datamart_core.dm_asoe_jira_scrum_summary AS summ
ON summ.jira_release = jrel.`release`
AND summ.sow_num = jrel.sow
AND summ.jira_project_key = jira.project_key
WHERE
jira.issue_type = 'Impediment'
;
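To see how the flat join chain behaves, here is a minimal sketch on made-up data. It assumes SQLite (via Python's sqlite3) instead of Hive, and the table names jira, jira_release and jira_label are simplified stand-ins for the datamart_core tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jira (issue_id INTEGER, issue_type TEXT);
    CREATE TABLE jira_release (issue_id INTEGER, release_name TEXT);
    CREATE TABLE jira_label (issue_id INTEGER, label TEXT);
    INSERT INTO jira VALUES (1, 'Impediment'), (2, 'Impediment'), (3, 'Story');
    INSERT INTO jira_release VALUES (1, 'R1'), (2, 'R1'), (3, 'R1');
    -- issue 1 has two labels, issue 2 has none
    INSERT INTO jira_label VALUES (1, 'blocked'), (1, 'infra');
""")

# Same shape as the answer: INNER JOIN, then LEFT JOIN, then WHERE.
rows = conn.execute("""
    SELECT jira.issue_id, jrel.release_name, jlab.label
    FROM jira
    INNER JOIN jira_release AS jrel ON jrel.issue_id = jira.issue_id
    LEFT JOIN jira_label AS jlab ON jlab.issue_id = jrel.issue_id
    WHERE jira.issue_type = 'Impediment'
    ORDER BY jira.issue_id, jlab.label
""").fetchall()

for row in rows:
    print(row)
```

The issue without a label is kept with a NULL label, and an issue with several labels appears once per label, so expect more rows than before if labels are one-to-many.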

Related

Passing different column values to where clause

SELECT pims.icicimedicalexaminerreport.id,
pims.icicimerfemaleapplicant.adversemenstrualid,
pims.icicimerfemaleapplicant.pregnantid,
pims.icicimerfemaleapplicant.miscarriageabortionid,
pims.icicimerfemaleapplicant.breastdiseaseid,
pims.pimscase.tiannumber
FROM pims.pimscase
INNER JOIN pims.digitization
ON pims.pimscase.digitizationid = pims.digitization.id
INNER JOIN pims.medicalexaminerreport
ON pims.digitization.medicalexaminerreportid =
pims.medicalexaminerreport.id
INNER JOIN pims.icicimedicalexaminerreport
ON pims.medicalexaminerreport.id =
pims.icicimedicalexaminerreport.id
INNER JOIN pims.icicimerfemaleapplicant
ON pims.icicimedicalexaminerreport.id =
pims.icicimerfemaleapplicant.id
WHERE pims.pimscase.tiannumber = 'ICICI1234567890'
which gives me the following output
Now I want to use the values in the above output to select rows from the table "YesNoAnswerWithObservation".
I imagine it should look something like this: Select * from YesNoAnswerWithObservation Where Id in (22,27,26,...23)
Only, instead of typing the values inside the IN clause, I want to use the values in each column returned by the above query.
I tried the code below, but it returns all the rows in the table rather than only the rows listed in the IN clause:
SELECT pims.yesnoanswerwithobservation.observation,
graphitegtccore.yesnoquestion.description,
pims.yesnoanswerwithobservation.id ObservationId
FROM pims.yesnoanswerwithobservation
INNER JOIN graphitegtccore.yesnoquestion
ON pims.yesnoanswerwithobservation.yesnoanswerid =
graphitegtccore.yesnoquestion.id
WHERE EXISTS (SELECT pims.icicimedicalexaminerreport.id,
pims.icicimerfemaleapplicant.adversemenstrualid,
pims.icicimerfemaleapplicant.pregnantid,
pims.icicimerfemaleapplicant.pelvicorgandiseaseid,
pims.icicimerfemaleapplicant.miscarriageabortionid,
pims.icicimerfemaleapplicant.gynocologicalscanid,
pims.icicimerfemaleapplicant.breastdiseaseid,
pims.pimscase.tiannumber
FROM pims.pimscase
INNER JOIN pims.digitization
ON pims.pimscase.digitizationid =
pims.digitization.id
INNER JOIN pims.medicalexaminerreport
ON pims.digitization.medicalexaminerreportid =
pims.medicalexaminerreport.id
INNER JOIN pims.icicimedicalexaminerreport
ON pims.medicalexaminerreport.id =
pims.icicimedicalexaminerreport.id
INNER JOIN pims.icicimerfemaleapplicant
ON pims.icicimedicalexaminerreport.id =
pims.icicimerfemaleapplicant.id
WHERE pims.pimscase.tiannumber = 'ICICI1234567890')
Any help or a nudge in the right direction would be greatly appreciated
Presumably you want the ids from the first query:
SELECT awo.observation, ynq.description, awo.id AS ObservationId
FROM pims.yesnoanswerwithobservation awo JOIN
graphitegtccore.yesnoquestion ynq
ON awo.yesnoanswerid = ynq.id
WHERE awo.id IN (SELECT mer.id
FROM pims.pimscase c JOIN
pims.digitization d
ON c.digitizationid = d.id JOIN
pims.medicalexaminerreport mer
ON d.medicalexaminerreportid = mer.id JOIN
pims.icicimedicalexaminerreport imer
ON mer.id = imer.id JOIN
pims.icicimerfemaleapplicant ifa
ON imer.id = ifa.id
WHERE c.tiannumber = 'ICICI1234567890'
) ;
Notice that table aliases make the query much easier to write and to read.
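Two things may be worth spelling out. An EXISTS whose subquery does not refer to the outer row is true for every outer row as soon as the subquery returns anything, which is why the attempt in the question returned the whole table. And if the goal really is to match the values held in several columns, one option is to stack those columns with UNION ALL inside the IN. The sketch below is an assumption-laden illustration using SQLite through Python's sqlite3; the tables female_applicant and answer are simplified, hypothetical stand-ins for the pims tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE female_applicant (
        id INTEGER,
        adversemenstrualid INTEGER,
        pregnantid INTEGER,
        breastdiseaseid INTEGER
    );
    CREATE TABLE answer (id INTEGER, observation TEXT);
    INSERT INTO female_applicant VALUES (1, 22, 27, 26);
    INSERT INTO answer VALUES (22, 'a'), (26, 'b'), (27, 'c'), (99, 'unrelated');
""")

# Uncorrelated EXISTS: true for every row, so nothing is filtered out.
exists_count = conn.execute("""
    SELECT COUNT(*)
    FROM answer AS a
    WHERE EXISTS (SELECT adversemenstrualid FROM female_applicant WHERE id = 1)
""").fetchone()[0]

# Stack the id columns into a single column and match against it.
rows = conn.execute("""
    SELECT a.id, a.observation
    FROM answer AS a
    WHERE a.id IN (
        SELECT adversemenstrualid FROM female_applicant WHERE id = 1
        UNION ALL
        SELECT pregnantid FROM female_applicant WHERE id = 1
        UNION ALL
        SELECT breastdiseaseid FROM female_applicant WHERE id = 1
    )
    ORDER BY a.id
""").fetchall()

print(exists_count, rows)
```

In the real query each branch of the UNION ALL would repeat the join chain and the tiannumber filter, selecting a different id column each time.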

Create Table Subquery for a join clause

The problem is that OPTION_NAME in neo_product_benefit has no direct relation to the other tables. It is related to the neo_claims_pmb_details table via the OPTION_ID column, and neo_claims_pmb_details has a BENEFIT_ID column that can be joined to the other tables.
I'm not sure what the SQL would look like to get OPTION_NAME and join it to the other tables. I thought I could create a temporary table, join it, and drop it afterwards, but I have no idea how the syntax would work.
Any help would be appreciated.
SELECT a.batch_id,
a.claim_id,
a.cover_no,
a.receive_date,
a.practice_no,
a.service_provider_no,
a.refering_provider_no,
b.claim_line_id,
b.dependent_code,
b.service_date_from,
b.service_date_to,
b.cheque_run_date,
b.process_date,
b.tariff_code_no,
b.tariff_amount,
b.claimed_amount,
c.amount_paid,
d.practice_name,
e.discipline,
e.discipline_description,
g.rule_no,
g.message_code,
g.long_msg_description,
h.benefit_code,
h.benefit_description,
t.option_name
FROM neo_claims a
LEFT JOIN neo_claim_line b
ON (a.claim_id = b.claim_id)
LEFT JOIN neo_claim_line_benefit c
ON (b.claim_line_id = c.claim_line_id)
LEFT JOIN neo_practice_details d
ON ( a.practice_no = d.practice_no)
LEFT JOIN neo_sub_disciplines e
ON ( d.sub_discipline = e.sub_discipline)
LEFT JOIN neo_claimline_firings g
ON (b.claim_line_id = g.claim_line_id)
LEFT JOIN neo_product_benefit h
ON (c.benefit_id = h.benefit_id)
(
SELECT i.*,
j.*
INTO temp_table
FROM neo_claims_pmb_details j,
neo_product_optin i)
LEFT JOIN temp_table t
ON ( j.benefit_id = t.benefit_id)
WHERE a.batch_id = 3496584;
DROP TABLE temp_table;
You could join the two tables directly by adding the relevant (left or inner) joins; in my sample the tables have the aliases tx and ty.
Based on your comment, here it is wrapped in a CREATE TABLE:
CREATE TABLE temp_table AS
SELECT a.batch_id,
a.claim_id,
a.cover_no,
a.receive_date,
a.practice_no,
a.service_provider_no,
a.refering_provider_no,
b.claim_line_id,
b.dependent_code,
b.service_date_from,
b.service_date_to,
b.cheque_run_date,
b.process_date,
b.tariff_code_no,
b.tariff_amount,
b.claimed_amount,
c.amount_paid,
d.practice_name,
e.discipline,
e.discipline_description,
g.rule_no,
g.message_code,
g.long_msg_description,
h.benefit_code,
h.benefit_description,
ty.OPTION_NAME
FROM neo_claims a
LEFT JOIN neo_claim_line b ON (a.claim_id = b.claim_id)
LEFT JOIN neo_claim_line_benefit c ON (b.claim_line_id = c.claim_line_id)
LEFT JOIN neo_practice_details d ON ( a.practice_no = d.practice_no)
LEFT JOIN neo_sub_disciplines e ON ( d.sub_discipline = e.sub_discipline)
LEFT JOIN neo_claimline_firings g ON (b.claim_line_id = g.claim_line_id)
LEFT JOIN neo_product_benefit h ON (c.benefit_id = h.benefit_id)
LEFT JOIN neo_claims_pmb_details tx ON tx.BENEFIT_ID = h.benefit_id
LEFT JOIN neo_product_benefit ty ON tx.OPTION_ID = ty.OPTION_ID
WHERE a.batch_id = 3496584;
My guess is that no temp table is needed. Just include a table expression (derived table) in the original query:
...
LEFT JOIN neo_product_benefit h
ON (c.benefit_id = h.benefit_id)
LEFT JOIN ( SELECT j.benefit_id, i.option_name -- correct cols as needed
FROM neo_claims_pmb_details j
JOIN neo_product_optin i
ON (i.option_id = j.option_id)) t
ON (c.benefit_id = t.benefit_id)
WHERE a.batch_id = 3496584;
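As a rough check of the derived-table idea, here is a small sketch. It assumes SQLite through Python's sqlite3, and the tables claim_benefit, pmb_details and product_option (with their columns and rows) are hypothetical, cut-down versions of the neo_ tables described in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE claim_benefit (claim_line_id INTEGER, benefit_id INTEGER);
    CREATE TABLE pmb_details (benefit_id INTEGER, option_id INTEGER);
    CREATE TABLE product_option (option_id INTEGER, option_name TEXT);
    INSERT INTO claim_benefit VALUES (100, 10), (101, 20), (102, 30);
    INSERT INTO pmb_details VALUES (10, 1), (20, 2);
    INSERT INTO product_option VALUES (1, 'Basic'), (2, 'Premium');
""")

# The derived table t resolves benefit_id -> option_name once, and the
# outer query left joins it like any other table. No temp table involved.
rows = conn.execute("""
    SELECT c.claim_line_id, t.option_name
    FROM claim_benefit AS c
    LEFT JOIN (
        SELECT j.benefit_id, i.option_name
        FROM pmb_details AS j
        INNER JOIN product_option AS i ON i.option_id = j.option_id
    ) AS t ON t.benefit_id = c.benefit_id
    ORDER BY c.claim_line_id
""").fetchall()

print(rows)
```

Claim lines whose benefit has no option simply come back with a NULL option_name. Note that the derived table needs its own join condition; without one it is a cross join of the two tables.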

Recursive query with outer joins?

I'm attempting the following query,
DECLARE @EntityType varchar(25)
SET @EntityType = 'Accessory';
WITH Entities (
E_ID, E_Type,
P_ID, P_Name, P_DataType, P_Required, P_OnlyOne,
PV_ID, PV_Value, PV_EntityID, PV_ValueEntityID,
PV_UnitValueID, PV_UnitID, PV_UnitName, PV_UnitDesc, PV_MeasureID, PV_MeasureName, PV_UnitValue,
PV_SelectionID, PV_DropDownID, PV_DropDownName, PV_DropDownOptionID, PV_DropDownOptionName, PV_DropDownOptionDesc,
RecursiveLevel
)
AS
(
-- Original Query
SELECT dbo.Entity.ID AS E_ID, dbo.EntityType.Name AS E_Type,
dbo.Property.ID AS P_ID, dbo.Property.Name AS P_Name, DataType.Name AS P_DataType, Required AS P_Required, OnlyOne AS P_OnlyOne,
dbo.PropertyValue.ID AS PV_ID, dbo.PropertyValue.Value AS PV_Value, dbo.PropertyValue.EntityID AS PV_EntityID, dbo.PropertyValue.ValueEntityID AS PV_ValueEntityID,
dbo.UnitValue.ID AS PV_UnitValueID, dbo.UnitOfMeasure.ID AS PV_UnitID, dbo.UnitOfMeasure.Name AS PV_UnitName, dbo.UnitOfMeasure.Description AS PV_UnitDesc, dbo.Measure.ID AS PV_MeasureID, dbo.Measure.Name AS PV_MeasureName, dbo.UnitValue.UnitValue AS PV_UnitValue,
dbo.DropDownSelection.ID AS PV_SelectionID, dbo.DropDown.ID AS PV_DropDownID, dbo.DropDown.Name AS PV_DropDownName, dbo.DropDownOption.ID AS PV_DropDownOptionID, dbo.DropDownOption.Name AS PV_DropDownOptionName, dbo.DropDownOption.Description AS PV_DropDownOptionDesc,
0 AS RecursiveLevel
FROM dbo.Entity
INNER JOIN dbo.EntityType ON dbo.EntityType.ID = dbo.Entity.TypeID
INNER JOIN dbo.Property ON dbo.Property.EntityTypeID = dbo.Entity.TypeID
INNER JOIN dbo.PropertyValue ON dbo.Property.ID = dbo.PropertyValue.PropertyID AND dbo.PropertyValue.EntityID = dbo.Entity.ID
INNER JOIN dbo.DataType ON dbo.DataType.ID = dbo.Property.DataTypeID
LEFT JOIN dbo.UnitValue ON dbo.UnitValue.ID = dbo.PropertyValue.UnitValueID
LEFT JOIN dbo.UnitOfMeasure ON dbo.UnitOfMeasure.ID = dbo.UnitValue.UnitOfMeasureID
LEFT JOIN dbo.Measure ON dbo.Measure.ID = dbo.UnitOfMeasure.MeasureID
LEFT JOIN dbo.DropDownSelection ON dbo.DropDownSelection.ID = dbo.PropertyValue.DropDownSelectedID
LEFT JOIN dbo.DropDownOption ON dbo.DropDownOption.ID = dbo.DropDownSelection.SelectedOptionID
LEFT JOIN dbo.DropDown ON dbo.DropDown.ID = dbo.DropDownSelection.DropDownID
WHERE dbo.EntityType.Name = @EntityType
UNION ALL
-- Recursive Query?
SELECT E2.E_ID AS E_ID, dbo.EntityType.Name AS E_Type,
dbo.Property.ID AS P_ID, dbo.Property.Name AS P_Name, DataType.Name AS P_DataType, Required AS P_Required, OnlyOne AS P_OnlyOne,
dbo.PropertyValue.ID AS PV_ID, dbo.PropertyValue.Value AS PV_Value, dbo.PropertyValue.EntityID AS PV_EntityID, dbo.PropertyValue.ValueEntityID AS PV_ValueEntityID,
dbo.UnitValue.ID AS PV_UnitValueID, dbo.UnitOfMeasure.ID AS PV_UnitID, dbo.UnitOfMeasure.Name AS PV_UnitName, dbo.UnitOfMeasure.Description AS PV_UnitDesc, dbo.Measure.ID AS PV_MeasureID, dbo.Measure.Name AS PV_MeasureName, dbo.UnitValue.UnitValue AS PV_UnitValue,
dbo.DropDownSelection.ID AS PV_SelectionID, dbo.DropDown.ID AS PV_DropDownID, dbo.DropDown.Name AS PV_DropDownName, dbo.DropDownOption.ID AS PV_DropDownOptionID, dbo.DropDownOption.Name AS PV_DropDownOptionName, dbo.DropDownOption.Description AS PV_DropDownOptionDesc,
(RecursiveLevel + 1)
FROM Entities AS E2
INNER JOIN dbo.Entity ON dbo.Entity.ID = E2.PV_ValueEntityID
INNER JOIN dbo.EntityType ON dbo.EntityType.ID = dbo.Entity.TypeID
INNER JOIN dbo.Property ON dbo.Property.EntityTypeID = dbo.Entity.TypeID
INNER JOIN dbo.PropertyValue ON dbo.Property.ID = dbo.PropertyValue.PropertyID AND dbo.PropertyValue.EntityID = E2.E_ID
INNER JOIN dbo.DataType ON dbo.DataType.ID = dbo.Property.DataTypeID
INNER JOIN dbo.UnitValue ON dbo.UnitValue.ID = dbo.PropertyValue.UnitValueID
INNER JOIN dbo.UnitOfMeasure ON dbo.UnitOfMeasure.ID = dbo.UnitValue.UnitOfMeasureID
INNER JOIN dbo.Measure ON dbo.Measure.ID = dbo.UnitOfMeasure.MeasureID
INNER JOIN dbo.DropDownSelection ON dbo.DropDownSelection.ID = dbo.PropertyValue.DropDownSelectedID
INNER JOIN dbo.DropDownOption ON dbo.DropDownOption.ID = dbo.DropDownSelection.SelectedOptionID
INNER JOIN dbo.DropDown ON dbo.DropDown.ID = dbo.DropDownSelection.DropDownID
)
SELECT E_ID, E_Type,
P_ID, P_Name, P_DataType, P_Required, P_OnlyOne,
PV_ID, PV_Value, PV_EntityID, PV_ValueEntityID,
PV_UnitValueID, PV_UnitID, PV_UnitName, PV_UnitDesc, PV_MeasureID, PV_MeasureName, PV_UnitValue,
PV_SelectionID, PV_DropDownID, PV_DropDownName, PV_DropDownOptionID, PV_DropDownOptionName, PV_DropDownOptionDesc,
RecursiveLevel
FROM Entities
INNER JOIN [dbo].[Entity] AS dE
ON dE.ID = PV_EntityID
The problem is that the second query, the "recursive one", isn't getting the data I expect, since I can't use the LEFT JOINs there as I do in the first query (at least to my understanding).
If I remove the fetching of the data that requires the LEFT (Outer) JOINs then the recursion works perfectly. My problem is I need both. Is there a way I can accomplish this?
Per http://msdn.microsoft.com/en-us/library/ms175972.aspx, you cannot use a LEFT, RIGHT, or OUTER JOIN in the recursive member of a CTE.
The recursive member can't contain a subquery either, so I suggest following this example.
They use two CTEs. The first is not recursive and does the left join to get the data it needs. The second CTE is recursive and inner joins on the first CTE. Since CTE1 is not recursive, it can left join and supply default values for the missing rows, so every row is guaranteed to match in the inner join.
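The two-CTE shape can be sketched on a toy hierarchy. This is only an illustration under stated assumptions: it runs on SQLite through Python's sqlite3 (which spells it WITH RECURSIVE and does not share SQL Server's restrictions), and the tables entity and unit, the CTE names enriched and tree, and the 'n/a' default are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entity (id INTEGER, parent_id INTEGER);
    CREATE TABLE unit (entity_id INTEGER, unit_name TEXT);
    INSERT INTO entity VALUES (1, NULL), (2, 1), (3, 2);
    -- entity 2 has no unit row, which is why an outer join is needed
    INSERT INTO unit VALUES (1, 'kg'), (3, 'm');
""")

rows = conn.execute("""
    WITH RECURSIVE
    -- CTE 1: not recursive, so the LEFT JOIN lives here.
    enriched AS (
        SELECT e.id, e.parent_id, COALESCE(u.unit_name, 'n/a') AS unit_name
        FROM entity AS e
        LEFT JOIN unit AS u ON u.entity_id = e.id
    ),
    -- CTE 2: recursive, and uses only an INNER JOIN onto CTE 1.
    tree AS (
        SELECT id, unit_name, 0 AS depth
        FROM enriched
        WHERE parent_id IS NULL
        UNION ALL
        SELECT en.id, en.unit_name, tree.depth + 1
        FROM tree
        INNER JOIN enriched AS en ON en.parent_id = tree.id
    )
    SELECT id, unit_name, depth FROM tree ORDER BY depth, id
""").fetchall()

print(rows)
```

The entity without a unit still takes part in the recursion because the first CTE already gave it a row with a default value.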
However, you can also emulate a left join with a UNION and a subselect. It isn't normally useful, but it is interesting.
In that case, you keep your first statement as it is, with an inner join. It will return all rows that join successfully.
Then UNION that query with another query that drops the join but has a
NOT EXISTS(SELECT 1 FROM MISSING_ROWS_TABLE WHERE MAIN_TABLE.JOIN_CONDITION = MISSING_ROWS_TABLE.JOIN_CONDITION)
This gets all the rows that failed the join condition in query 1. You can replace the columns you would have got from MISSING_ROWS_TABLE with NULL. I had to do this once with a coding framework that didn't support outer joins. Since recursive CTEs don't allow subqueries, you have to use the first solution here.
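For the curious, the UNION emulation can be compared with a real left join on toy data. The sketch assumes SQLite via Python's sqlite3; main_table and missing_rows_table are hypothetical tables named after the placeholders above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE main_table (id INTEGER);
    CREATE TABLE missing_rows_table (main_id INTEGER, val TEXT);
    INSERT INTO main_table VALUES (1), (2), (3);
    INSERT INTO missing_rows_table VALUES (1, 'a'), (3, 'c');
""")

# Reference result: an ordinary LEFT JOIN.
left_join_rows = conn.execute("""
    SELECT m.id, x.val
    FROM main_table AS m
    LEFT JOIN missing_rows_table AS x ON x.main_id = m.id
    ORDER BY m.id
""").fetchall()

# Emulation: matched rows from an INNER JOIN, plus the unmatched rows
# found with NOT EXISTS, padded with NULL for the missing columns.
emulated_rows = conn.execute("""
    SELECT m.id, x.val
    FROM main_table AS m
    INNER JOIN missing_rows_table AS x ON x.main_id = m.id
    UNION ALL
    SELECT m.id, NULL
    FROM main_table AS m
    WHERE NOT EXISTS (
        SELECT 1 FROM missing_rows_table AS x WHERE x.main_id = m.id
    )
    ORDER BY 1
""").fetchall()

print(left_join_rows == emulated_rows, emulated_rows)
```

UNION ALL is used rather than UNION so that legitimate duplicate rows are not collapsed; the two branches cannot overlap anyway.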

How to write subquery inside the OUTER JOIN Statement

I want to join two tables, CUSTMR and DEPRMNT.
What I need is a LEFT OUTER JOIN of two or more tables, with a subquery inside the LEFT OUTER JOIN, as shown below:
Tables: CUSTMR, DEPRMNT
Query:
SELECT
cs.CUSID
,dp.DEPID
FROM
CUSTMR cs
LEFT OUTER JOIN (
SELECT
dp.DEPID
,dp.DEPNAME
FROM
DEPRMNT dp
WHERE
dp.DEPADDRESS = 'TOKYO'
)
ON (
dp.DEPID = cs.CUSID
AND cs.CUSTNAME = dp.DEPNAME
)
WHERE
cs.CUSID != ''
Here the subquery is:
SELECT
dp.DEPID, dp.DEPNAME
FROM
DEPRMNT dp
WHERE
dp.DEPADDRESS = 'TOKYO'
Is it possible to write such a subquery inside a LEFT OUTER JOIN?
I am getting an error when running this query on my DB2 database.
You need a correlation name (an alias after the closing parenthesis, ss in the query below) on the sub-select in order to reference its fields in the ON condition. Aliases assigned inside the sub-select are not visible outside it.
SELECT
cs.CUSID
,ss.DEPID
FROM
CUSTMR cs
LEFT OUTER JOIN (
SELECT
DEPID
,DEPNAME
FROM
DEPRMNT
WHERE
DEPADDRESS = 'TOKYO'
) ss
ON (
ss.DEPID = cs.CUSID
AND ss.DEPNAME = cs.CUSTNAME
)
WHERE
cs.CUSID != ''
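I don't think you need a subquery in this scenario. You can left outer join the DEPRMNT table directly.
When using a LEFT OUTER JOIN, don't filter on columns of the right-hand table in the WHERE clause: rows with no match have NULL in those columns, so the filter discards them and the join behaves like an inner join. Put such conditions in the ON clause instead.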
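The difference between filtering in WHERE and in ON shows up clearly on a few rows. This is a minimal sketch assuming SQLite through Python's sqlite3, with made-up rows in simplified custmr and deprmnt tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE custmr (cusid INTEGER, custname TEXT);
    CREATE TABLE deprmnt (depid INTEGER, depname TEXT, depaddress TEXT);
    INSERT INTO custmr VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO deprmnt VALUES (1, 'A', 'TOKYO'), (2, 'B', 'OSAKA');
""")

# Filter on the right-hand table in WHERE: unmatched customers vanish.
where_rows = conn.execute("""
    SELECT cs.cusid, dp.depid
    FROM custmr AS cs
    LEFT OUTER JOIN deprmnt AS dp ON dp.depid = cs.cusid
    WHERE dp.depaddress = 'TOKYO'
    ORDER BY cs.cusid
""").fetchall()

# Same filter moved into ON: every customer is kept.
on_rows = conn.execute("""
    SELECT cs.cusid, dp.depid
    FROM custmr AS cs
    LEFT OUTER JOIN deprmnt AS dp
        ON dp.depid = cs.cusid AND dp.depaddress = 'TOKYO'
    ORDER BY cs.cusid
""").fetchall()

print(where_rows)
print(on_rows)
```

With the condition in ON, this direct join returns the same rows as joining to the filtered subquery, which is why the subquery is optional here.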

Joining two tables on a key and then left outer joining a table on a number of criteria

I'm attempting to join 3 tables together in a single query. The first two share a key, so each entry has a matching entry. The result of that join will then be joined to a third table, which could produce multiple entries for each entry from the first two (joined) tables.
select * from
(select a.bidentifier, a.bsession, a.symbol, b.jidentifier, b.JSession
from trade_monthly a, trade_monthly_second b
where
a.bidentifier = b.jidentifier AND
a.bsession = b.JSession)
left outer join
trade c
on c.symbol = a.symbol
order by a.bidentifier, a.bsession, a.symbol, b.jidentifier, b.JSession, c.symbol
There will be more criteria (not just c.symbol = a.symbol) on the left outer join, but for now this should be enough. How can I nest the queries this way? I'm getting an "SQL command not properly ended" error.
Any help is appreciated.
Thanks
As far as I know, every derived table must be given a name, and the outer query has to refer to its columns through that name, so try something like this:
SELECT * FROM
(SELECT a.bidentifier, ....
...
a.bsession = b.JSession) t
LEFT JOIN trade c
ON c.symbol = t.symbol
ORDER BY t.bidentifier, ...
Anyway I think you could use a simpler query:
SELECT a.bidentifier, a.bsession, a.symbol, b.jidentifier, b.JSession, c.*
FROM trade_monthly a
INNER JOIN trade_monthly_second b
ON a.bidentifier = b.jidentifier
AND a.bsession = b.JSession
LEFT JOIN trade c
ON c.symbol = a.symbol
ORDER BY a.bidentifier, a.bsession, a.symbol, b.jidentifier, b.JSession, c.symbol
Try this:
SELECT
`trade_monthly`.`bidentifier` AS `bidentifier`,
`trade_monthly`.`bsession` AS `bsession`,
`trade_monthly`.`symbol` AS `symbol`,
`trade_monthly_second`.`jidentifier` AS `jidentifier`,
`trade_monthly_second`.`jsession` AS `jsession`
FROM
(
(
`trade_monthly`
JOIN `trade_monthly_second` ON(
(
(
`trade_monthly`.`bidentifier` = `trade_monthly_second`.`jidentifier`
)
AND(
`trade_monthly`.`bsession` = `trade_monthly_second`.`jsession`
)
)
)
)
LEFT JOIN `trade` ON(
(
`trade`.`symbol` = `trade_monthly`.`symbol`
)
)
)
ORDER BY
`trade_monthly`.`bidentifier`,
`trade_monthly`.`bsession`,
`trade_monthly`.`symbol`,
`trade_monthly_second`.`jidentifier`,
`trade_monthly_second`.`jsession`,
`trade`.`symbol`
Why don't you just create a view of the two inner-joined tables? Then you can build a query that joins this view to the trade table using the left outer join matching criteria.
In my opinion, views are one of the most overlooked solutions to a lot of complex queries.
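A rough sketch of the view approach follows. It assumes SQLite through Python's sqlite3 rather than the asker's database; the view name monthly_pairs, the price column on trade, and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trade_monthly (bidentifier INTEGER, bsession INTEGER, symbol TEXT);
    CREATE TABLE trade_monthly_second (jidentifier INTEGER, jsession INTEGER);
    CREATE TABLE trade (symbol TEXT, price REAL);
    INSERT INTO trade_monthly VALUES (1, 1, 'AAA'), (2, 1, 'BBB');
    INSERT INTO trade_monthly_second VALUES (1, 1), (2, 1);
    INSERT INTO trade VALUES ('AAA', 10.0), ('AAA', 11.0);

    -- The inner join of the two keyed tables, defined once as a view.
    CREATE VIEW monthly_pairs AS
        SELECT a.bidentifier, a.bsession, a.symbol, b.jidentifier, b.jsession
        FROM trade_monthly AS a
        INNER JOIN trade_monthly_second AS b
            ON a.bidentifier = b.jidentifier
           AND a.bsession = b.jsession;
""")

# The view is then left joined like an ordinary table.
rows = conn.execute("""
    SELECT v.bidentifier, v.symbol, c.price
    FROM monthly_pairs AS v
    LEFT JOIN trade AS c ON c.symbol = v.symbol
    ORDER BY v.bidentifier, c.price
""").fetchall()

print(rows)
```

Extra matching criteria for the trade table would go into the ON clause of the outer query, leaving the view untouched.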