Query returning duplicate rows - sql

I wrote this query
SELECT DISTINCT
F2_FILIAL, F2_SERIE, F2_DOC,
C6_NUM, AB7_NUMOS, A1_NOME, F2_EMISSAO, F2_VALBRUT, F2_VEND1,
A3_NOME,F2_COND , E4_DESCRI, C5_NATUREZ, ED_DESCRIC, AAG_DESCRI
FROM
SF2010 SF
LEFT JOIN
SE4010 SE ON F2_COND = E4_CODIGO
LEFT JOIN
SA3010 A3 ON F2_VEND1 = A3_COD
LEFT JOIN
SA1010 A1 ON F2_CLIENTE = A1_COD
LEFT JOIN
SD2010 SD ON F2_DOC = D2_DOC
LEFT JOIN
SC6010 C6 ON D2_PEDIDO = C6_NUM
LEFT JOIN
SC5010 C5 ON D2_PEDIDO = C5_NUM
LEFT JOIN
SED010 ED ON C5_NATUREZ = ED_CODIGO
LEFT JOIN
AB7010 AB ON SUBSTRING(C6_NUMOS,1,6) = AB7_NUMOS
LEFT JOIN
AAG010 AG ON AB7_CODPRB = AAG_CODPRB
WHERE
(F2_CLIENTE >= ' '
AND F2_CLIENTE <= 'zzzzzz')
AND (F2_EMISSAO >= '20170222'
AND F2_EMISSAO <= '20170222')
AND (F2_VEND1 >= ''
AND F2_VEND1 <= 'zz')
AND (C5_NATUREZ >= ''
AND C5_NATUREZ <= 'zzzzzzzzzz')
AND (F2_COND >= ''
AND F2_COND <= 'zzz')
AND (F2_FILIAL >= ''
AND F2_FILIAL <= 'zz')
AND (SF.D_E_L_E_T_ <> '*')
AND F2_DUPL <> ''
AND F2_VALFAT <> 0
ORDER BY
F2_VEND1, F2_EMISSAO
And it results in something like this:
Notice that the 2 last rows are the same (the main field here is F2_DOC, it should never appear twice), but since the field C6_NUM and AB7_NUMOS has more than one reference it displays both of them, duplicating the row.
How can I improve my query to not duplicate a row when the table I'm joining has more than 1 distinct FK to the table I'm querying?

If you are doing multiple left joins and the leftest table in the join doesn't have ( or not mentioned in select ) a unique value, it's possible you get duplicate rows. if you need to have a unique column, use the primary key in the leftest table inside the select statement or add a column like
row_number() as id
to your query.

Related

SQL Query to apply join on basis of case Condition

I have a requirement where I need to fetch the Dimension Key of Region table on basis of the following preference.
Fetch dimension key on basis of Zipcode of Physical address(PA)
If the first condition is not satisfied that fetch dimension key on basis of the Zip Code of the Mailing address
If the second condition is also not satisfied than fetch the dimension key on basis of the Parish Code of Physical address
Else fetch dimension key on basis of parish Code of Mailing address.
I am trying to use the below query but is giving multiple records since all left joins are getting evaluated. I want that it should not go on the second condition if the first condition is satisfied.
select REGION_DIM_SK, CASE_NUM
from (
select distinct COALESCE(RDIM.REGION_DIM_SK, RDIM1.REGION_DIM_SK, RDIM2.REGION_DIM_SK, RDIM3.REGION_DIM_SK) AS REGION_DIM_SK
, DC.CASE_NUM, ADDR_TYPE_CD
FROM rpt_dm_ee_intg.CASE_PERSON_ADDRESS dc
left join rpt_dm_ee_prsnt.REGION_DIM RDIM on dc.ZIP_CODE = RDIM.ZIP_CODE and RDIM.REGION_EFF_END_DT IS NULL and dc.addr_type_cd='PA' AND dc.EFF_END_DT IS NULL
left join rpt_dm_ee_prsnt.REGION_DIM RDIM1 ON dc.ZIP_CODE = RDIM1.ZIP_CODE AND RDIM1.REGION_EFF_END_DT IS NULL AND dc.addr_type_cd='MA' AND DC.EFF_END_DT IS NULL
left join (
select PARISH_CD, min(REGION_DIM_SK) as REGION_DIM_SK
from rpt_dm_ee_prsnt.REGION_DIM
where REGION_EFF_END_DT is null
group by PARISH_CD
) RDIM2 ON dc.addr_type_cd='PA' and dc.PARISH_CD = RDIM2.PARISH_CD AND DC.EFF_END_DT IS NULL
left join (
select PARISH_CD, min(REGION_DIM_SK) as REGION_DIM_SK
from rpt_dm_ee_prsnt.REGION_DIM
where REGION_EFF_END_DT is null
group by PARISH_CD
) RDIM3 ON dc.addr_type_cd='MA' and dc.PARISH_CD = RDIM3.PARISH_CD AND DC.EFF_END_DT IS NULL
) A
where REGION_DIM_SK is not null
) RD on RD.case_num = rpt_dm_ee_intg.CASE_PERSON_ELIGIBILITY.CASE_NUM
Use multiple left joins. Your query is rather hard to follow -- it has other tables and references not described in the problem.
But the idea is:
select t.*,
coalesce(rpa.dim_key, rm.dim_key, rpap.dim_key, rmp.dim_key) as dim_key
from t left join
dim_region rpa
on t.physical_address_zipcode = rpa.zipcode left join
dim_region rm
on t.mailing_address_zipcode = rm.zipcode and
rpa.zipcode is null left join
dim_region rpap
on t.physical_addresss_parishcode = rpap.parishcode and
rm.zipcode is null left join
dim_region rmp
on t.physical_addresss_parishcode = rmp.parishcode and
rpap.zipcode is null
The trick is to put the conditions in CASE WHEN:
SELECT *
FROM table1 a
JOIN table2 b
ON CASE
WHEN a.code is not null and a.code = b.code THEN 1
WHEN a.type = b.type THEN 1
ELSE 0
END = 1
For your example you can reduce the code to just two joins, it can't be done in one as you are joining two different tables.
SELECT CASE WHEN RDIM.addres IS NULL THEN RDIM2.addres ELSE RDIM.addres
FROM rpt_dm_ee_intg.CASE_PERSON_ADDRESS dc
LEFT JOIN rpt_dm_ee_prsnt.REGION_DIM RDIM ON CASE
WHEN (dc.ZIP_CODE = RDIM.ZIP_CODE
AND RDIM.REGION_EFF_END_DT IS NULL
AND dc.addr_type_cd='PA'
AND dc.EFF_END_DT IS NULL) THEN 1
WHEN (dc.ZIP_CODE = RDIM1.ZIP_CODE
AND RDIM1.REGION_EFF_END_DT IS NULL
AND dc.addr_type_cd='MA'
AND DC.EFF_END_DT IS NULL) THEN 1
ELSE 0
END = 1
LEFT JOIN
(SELECT PARISH_CD,
min(REGION_DIM_SK) AS REGION_DIM_SK
FROM rpt_dm_ee_prsnt.REGION_DIM
WHERE REGION_EFF_END_DT IS NULL
GROUP BY PARISH_CD) RDIM2 ON CASE
WHEN (dc.addr_type_cd='PA'
AND dc.PARISH_CD = RDIM2.PARISH_CD
AND DC.EFF_END_DT IS NULL
AND RDIM.ZIP_CODE IS NULL) THEN 1
WHEN (dc.addr_type_cd='MA'
AND dc.PARISH_CD = RDIM3.PARISH_CD
AND DC.EFF_END_DT IS NULL
AND RDIM.ZIP_CODE IS NULL) THEN 1
ELSE 0
END = 1
edit
If you don't want to have nulls from RDIM2 table if RDIM1 zip code is present the logic could be easily extended to support that. You just need to add AND RDIM.ZIP_CODE IS NULL to CASE WHEN conditions.

T-SQL Full Outer Join (sample provided)

I have some issues getting a full outer join to work in T-SQL. The left outer join part seems to be working okay, but the right outer join doesn't work as expected. Here's some sample data to test this on:
I have a table A with the following columns and data. The row marked with red is the row that cannot be found in table B.
And a second table B with the following columns and data. The rows marked with yellow are the rows that cannot be found in table A.
I am trying to join the tables using the following sql code:
select tableA.klientnr, tableA.uttakstype, tableA.uttaksnr, tableA.vareanr TableAItem,tableB.vareanr tableBitem, tableA.kvantum tableAquantity, tableB.totkvant tableBquantity
from tableA as tableA
full outer join tableB as tableB on tableA.klientnr=tableB.klientnr and tableA.uttakstype=tableB.uttakstype and tableA.uttaksnr=tableB.uttaksnr and tableA.vareanr=tableB.vareanr and tableB.IsDeleted=0
where tableA.UttaksNr=639779 and tableA.IsDeleted=0
The result of the sql is the following image. The row marked in red is the extra row from tableA that does show up, but I can't get the rows from table B to show up
Expected to have 2 extra rows
550 SA 639779 NULL 100059 NULL 0
550 SA 639779 NULL 103040 NULL 14
Later edit:
Would this be correct way to handle the full outer join where there's the header/line type of structure? Or can the query be optimized?
SELECT ISNULL(q1.accountid, q2.accountid) AccountId
,ISNULL(q1.klientnr, q2.klientnr) KlientNr
,ISNULL(q1.tilgangstype, q2.tilgangstype) 'Reception Type'
,ISNULL(q1.tilgangsnr, q2.tilgangsnr) 'Reception No'
,ISNULL(q1.dato, q2.dato) dato
,ISNULL(q1.LevNr, q2.LevNr) LevNr
,ISNULL(q1.Pakkemerke, q2.Pakkemerke) Pakkemerke
,ISNULL(q1.VareANr, q2.VareANr) VareANr
,ISNULL(q1.Ankomstdato,q2.Ankomstdato) 'Arrival Date'
,q1.Antall1
,q1.totkvant1
,q1.Antall2
,q1.totkvant2
,q2.Antall
,q2.totkvant
,q2.AntallTilFrys
,q2.TotKvantTilFrys
,ISNULL(q1.EksternKommentar1,q2.EksternKommentar1) EksternKommentar1
,q2.[Last Upsert]
FROM (
SELECT w700.accountid
,w700.klientnr
,w700.tilgangstype
,w700.tilgangsnr
,w700.dato
,w700.Ankomstdato
,w700.LevNr
,w700.pakkemerke
,w789.VareANr
,sum(IIF(w789.prognosetype = 1, w789.Antall, NULL)) AS Antall1
,sum(IIF(w789.prognosetype = 1, w789.totkvant, NULL)) AS totkvant1
,sum(IIF(w789.prognosetype = 2, w789.Antall, NULL)) AS Antall2
,sum(IIF(w789.prognosetype = 2, w789.totkvant, NULL)) AS totkvant2
,w700.EksternKommentar1
FROM trading.W789Prognosekjopstat AS w789
INNER JOIN trading.W700Tilgangshode AS w700 ON w700.AccountId = w789.AccountId
AND w700.KlientNr = w789.Klientnr
AND w700.Tilgangsnr = w789.Tilgangsnr
AND w700.Tilgangstype = w789.Tilgangstype
AND w700.IsDeleted = 0
WHERE w789.IsDeleted = 0
GROUP BY w700.accountid
,w700.klientnr
,w700.tilgangstype
,w700.tilgangsnr
,w700.dato
,w700.Ankomstdato
,w700.LevNr
,w700.pakkemerke
,w789.VareANr
,w700.EksternKommentar1
) q1
FULL OUTER JOIN (
SELECT w700.accountid
,w700.klientnr
,w700.tilgangstype
,w700.tilgangsnr
,w700.dato
,w700.Ankomstdato
,w700.LevNr
,w700.pakkemerke
,w702.VareANr
,w702.Antall
,w702.TotKvant
,w702.ValPris
,w702.AntallTilFrys
,w702.TotKvantTilFrys
,w700.EksternKommentar1
,(SELECT MAX(LastUpdateDate) FROM (VALUES (w702.createdAt),(w702.updatedAt)) AS UpdateDate(LastUpdateDate)) AS 'Last Upsert'
FROM trading.w702PrognoseKjop w702
INNER JOIN trading.W700Tilgangshode AS w700 ON w700.AccountId = w702.AccountId
AND w700.KlientNr = w702.Klientnr
AND w700.Tilgangsnr = w702.Tilgangsnr
AND w700.Tilgangstype = w702.Tilgangstype
AND w700.IsDeleted = 0
WHERE w702.IsDeleted = 0
) q2 ON q1.accountid = q2.accountid
AND q1.klientnr = q2.klientnr
AND q1.tilgangstype = q2.tilgangstype
AND q1.tilgangsnr = q2.tilgangsnr
AND q1.vareanr = q2.vareanr
WHERE totkvant1 IS NOT NULL
OR totkvant2 IS NOT NULL
OR totkvant IS NOT NULL
Filtering with full join is really tricky. Your where criteria are actually turning the full join into a left join.
You can do what you want by filtering before the join:
select a.klientnr, a.uttakstype, a.uttaksnr, a.vareanr a.tableAitem,
b.vareanr b.tableBitem, a.kvantum a.tableAquantity, b.totkvant b.tableBquantity
from (select a.*
from tableA a
where a.UttaksNr = 639779 and a.IsDeleted = 0
) a full join
(select b.*
from tableB b
where b.IsDeleted = 0
) b
on a.klientnr = b.klientnr and
a.uttakstype= b.uttakstype and
a.uttaksnr = b.uttaksnr and
a.vareanr = b.vareanr;

SQL statement merge two rows into one

In the results of my sql-statement (SQL Server 2016) I would like to combine two rows with the same value in two columns ("study_id" and "study_start") into one row and keep the row with higest value in a third cell ("Id"). If any columns (i.e. "App_id" or "Date_arrival) in the row with higest Id is NULL, then it should take the value from the row with the lowest "Id".
I get the result below:
Id study_id study_start Code Expl Desc Startmonth App_id Date_arrival Efter_op Date_begin
167262 878899 954 4.1 udd.ord Afbrudt feb 86666 21-06-2012 N 17-08-2012
180537 878899 954 1 Afsluttet Afsluttet feb NULL NULL NULL NULL
And I would like to get this result:
Id study_id study_start Code Expl Desc Startmonth App_id Date_arrival Efter_op Date_begin
180537 878899 954 1 Afsluttet Afsluttet feb 86666 21-06-2012 N 17-08-2012
My statement looks like this:
SELECT dbo.PopulationStam_V.ELEV_ID AS id,
dbo.PopulationStam_V.PERS_ID AS study_id,
dbo.STUDIESTARTER.STUDST_ID AS study_start,
dbo.Optagelse_Studiestatus.AFGANGSARSAG AS Code,
dbo.Optagelse_Studiestatus.KORT_BETEGNELSE AS Expl,
ISNULL((CAST(dbo.Optagelse_Studiestatus.Studiestatus AS varchar(20))), 'Indskrevet') AS 'Desc',
dbo.STUDIESTARTER.OPTAG_START_MANED AS Startmonth,
dbo.ANSOGNINGER.ANSOG_ID as App_id,
dbo.ANSOGNINGER.ANKOMSTDATO AS Data_arrival',
dbo.ANSOGNINGER.EFTEROPTAG AS Efter_op,
dbo.ANSOGNINGER.STATUSDATO AS Date_begin
FROM dbo.INSTITUTIONER
INNER JOIN dbo.PopulationStam_V
ON dbo.INSTITUTIONER.INST_ID = dbo.PopulationStam_V.SEMI_ID
LEFT JOIN dbo.ANSOGNINGER
ON dbo.PopulationStam_V.ELEV_ID = dbo.ANSOGNINGER.ELEV_ID
INNER JOIN dbo.STUDIESTARTER
ON dbo.PopulationStam_V.STUDST_ID_OPRINDELIG = dbo.STUDIESTARTER.STUDST_ID
INNER JOIN dbo.UDD_NAVNE_T
ON dbo.PopulationStam_V.UDDA_ID = dbo.UDD_NAVNE_T.UDD_ID
INNER JOIN dbo.UDDANNELSER
ON dbo.UDD_NAVNE_T.UDD_ID = dbo.UDDANNELSER.UDDA_ID
LEFT OUTER JOIN dbo.PERSONER
ON dbo.PopulationStam_V.PERS_ID = dbo.PERSONER.PERS_ID
LEFT OUTER JOIN dbo.POSTNR
ON dbo.PERSONER.PONR_ID = dbo.POSTNR.PONR_ID
LEFT OUTER JOIN dbo.KønAlleElevID_V
ON dbo.PopulationStam_V.ELEV_ID = dbo.KønAlleElevID_V.ELEV_ID
LEFT OUTER JOIN dbo.Optagelse_Studiestatus
ON dbo.PopulationStam_V.AFAR_ID = dbo.Optagelse_Studiestatus.AFAR_ID
LEFT OUTER JOIN dbo.frafaldsmodel_adgangsgrundlag
ON dbo.frafaldsmodel_adgangsgrundlag.ELEV_ID = dbo.PopulationStam_V.ELEV_ID
LEFT OUTER JOIN dbo.Optagelse_prioriteterUFM
ON dbo.Optagelse_prioriteterUFM.cpr = dbo.PopulationStam_V.CPR_NR
AND dbo.Optagelse_prioriteterUFM.Aar = dbo.frafaldsmodel_adgangsgrundlag.optagelsesaar
LEFT OUTER JOIN dbo.frafaldsmodel_stoettetabel_uddannelser AS fsu
ON fsu.id_uddannelse = dbo.UDDANNELSER.UDDA_ID
AND fsu.id_inst = dbo.INSTITUTIONER.INST_ID
AND fsu.uddannelse_aar = dbo.frafaldsmodel_adgangsgrundlag.optagelsesaar
WHERE dbo.STUDIESTARTER.STUDIESTARTSDATO > '2012-03-01 00:00:00.000'
AND (dbo.Optagelse_Studiestatus.AFGANGSARSAG IS NULL
OR dbo.Optagelse_Studiestatus.AFGANGSARSAG NOT LIKE '2.7.4')
AND (dbo.PopulationStam_V.INDSKRIVNINGSFORM = '1100'
OR dbo.PopulationStam_V.INDSKRIVNINGSFORM = '1700')
GROUP BY dbo.PopulationStam_V.ELEV_ID,
dbo.PopulationStam_V.PERS_ID,
dbo.STUDIESTARTER.STUDST_ID,
dbo.Optagelse_Studiestatus.AFGANGSARSAG,
dbo.Optagelse_Studiestatus.KORT_BETEGNELSE,
dbo.STUDIESTARTER.OPTAG_START_MANED,
Studiestatus,
dbo.ANSOGNINGER.ANSOG_ID,
dbo.ANSOGNINGER.ANKOMSTDATO,
dbo.ANSOGNINGER.EFTEROPTAG,
dbo.ANSOGNINGER.STATUSDATO
I really hope somebody out there can help.
Many ways, this will work:
WITH subSource AS (
/* Your query here */
)
SELECT
s1.id,
/* all other columns work like this:
COALESCE(S1.column,s2.column)
for example: */
coalesce(s1.appid,s2.appid) as appid
FROM subSource s1
INNER JOIN subSource s2
ON s1.study_id =s2.study_id
and s1.study_start = s2.study_start
AND s1.id > s2.id
/* I imagine some other clauses might be needed but maybe not */
The rest is copy paste

sqlite query not getting all records if 1 table has missing data

I've got a very complex database with a lot of tables in SQLite. I'm trying to design a query that will report out a lot of data from those tables and also report out those sheep who may not have a record in one or more tables.
My query is:
SELECT sheep_table.sheep_id,
(SELECT tag_number FROM id_info_table WHERE official_id = "1" AND id_info_table.sheep_id = sheep_table.sheep_id AND (tag_date_off IS NULL or tag_date_off = '')) AS fedtag,
(SELECT tag_number FROM id_info_table WHERE tag_type = "4" AND id_info_table.sheep_id = sheep_table.sheep_id AND (tag_date_off IS NULL or tag_date_off = '')) AS farmtag,
(SELECT tag_number FROM id_info_table WHERE tag_type = "2" AND id_info_table.sheep_id = sheep_table.sheep_id AND (tag_date_off IS NULL or tag_date_off = '') and ( id_info_table.official_id is NULL or id_info_table.official_id = 0 )) AS eidtag,
sheep_table.sheep_name, codon171_table.codon171_alleles, sheep_ebv_table.usa_maternal_index, sheep_ebv_table.self_replacing_carcass_index, cluster_table.cluster_name, sheep_evaluation_table.id_evaluationid,
(sheep_table.birth_type +
sheep_table.codon171 +
sheep_evaluation_table.trait_score01 +
sheep_evaluation_table.trait_score02 +
sheep_evaluation_table.trait_score03 +
sheep_evaluation_table.trait_score04 +
sheep_evaluation_table.trait_score05 +
sheep_evaluation_table.trait_score06 +
sheep_evaluation_table.trait_score07 +
sheep_evaluation_table.trait_score08 +
sheep_evaluation_table.trait_score09 +
sheep_evaluation_table.trait_score10 +
(sheep_evaluation_table.trait_score11 / 10 )) as overall_score, sheep_evaluation_table.sheep_rank, sheep_evaluation_table.number_sheep_ranked,
sheep_table.alert01,
sheep_table.birth_date, sheep_sex_table.sex_abbrev, birth_type_table.birth_type,
sire_table.sheep_name as sire_name, dam_table.sheep_name as dam_name
FROM sheep_table
join codon171_table on sheep_table.codon171 = codon171_table.id_codon171id
join sheep_cluster_table on sheep_table.sheep_id = sheep_cluster_table.sheep_id
join cluster_table on cluster_table.id_clusternameid = sheep_cluster_table.which_cluster
join birth_type_table on sheep_table.birth_type = birth_type_table.id_birthtypeid
join sheep_sex_table on sheep_table.sex = sheep_sex_table.sex_sheepid
join sheep_table as sire_table on sheep_table.sire_id = sire_table.sheep_id
join sheep_table as dam_table on sheep_table.dam_id = dam_table.sheep_id
left outer join sheep_ebv_table on sheep_table.sheep_id = sheep_ebv_table.sheep_id
left outer join sheep_evaluation_table on sheep_table.sheep_id = sheep_evaluation_table.sheep_id
WHERE (sheep_table.remove_date IS NULL or sheep_table.remove_date is '' )
and (eval_date > "2014-10-03%" and eval_date < "2014-11%")
and sheep_ebv_table.ebv_date = "2014-11-01"
order by sheep_sex_table.sex_abbrev asc, cluster_name asc, self_replacing_carcass_index desc, usa_maternal_index desc, overall_score desc
If a given sheep does not have a record in the evaluation table or does not have a record in the EBV table no record is returned. I need all the current animals returned with all available data on them and just leave the fields for EBVs and evaluations null if they have no data.
I'm not understanding why I'm not getting them all since none of the sheep have all 3 ID types (federal, farm and EID) so there are nulls in those fields and I was expecting nulls in the evaluation sum and ebv fields as well.
Totally lost in what to do to fix it.
The problem would appear to be that you're using eval_date in the WHERE statement. I'm assuming that eval_date is in the sheep_evaluation_table, so when you use it in WHERE, it gets rid of any rows where eval_date is NULL, which it would be when you're using a LEFT OUTER JOIN and there's no matching record in sheep_evaluation_table.
Try putting the eval_date filter on the join instead, like this:
left outer join sheep_evaluation_table on sheep_table.sheep_id = sheep_evaluation_table.sheep_id
AND (eval_date > "2014-10-03%" and eval_date < "2014-11%")
WHERE (sheep_table.remove_date IS NULL or sheep_table.remove_date is '' )

Optimization DB2 query with mass join

I have complex query:
select rma.RELATION_MANAGER_ID,
rm.ORG_STRUCTURE_ID,
rm.RELATIONSHIP_MANAGER_NM,
count(distinct ppa.PARTY_ID) as count_party
from RELATIONSHIP_MANAGER rm --15808 row
join RELATIONSHIP_MANAGER_MARKET rmm --1560 row
on rm.RELATIONSHIP_MANAGER_ID = rmm.RELATIONSHIP_MANAGER_ID
and rmm.INCLUDE_IN_REPORT = 'Y'
join MARKET_SEGMENT rm_ms --4 row
on rmm.MARKET_SEGMENT_ID = rm_ms.MARKET_SEGMENT_ID
and rm_ms.MARKET_SEGMENT = '01'
join RELATIONSHIP_MANAGER_ALLOCATION rma --61349 row
on rm.RELATIONSHIP_MANAGER_ID = rma.RELATIONSHIP_MANAGER_ID
join CMD_PARTY_PORTFOLIO_ALLOCATION ppa --3114096 row
on ppa.PORTFOLIO_ID = rma.PORTFOLIO_ID
join person ps --3112575 row
on ps.IS_DELETED != 1 and ppa.party_id = ps.party_id
join PARTY p --3114146 row
on ppa.party_id=p.party_id
join MARKET_SEGMENT ms --4 row
on p.MARKET_SEGMENT_ID = ms.MARKET_SEGMENT_ID and ms.MARKET_SEGMENT = '01'
where rm.IS_CM = 1 and rm.IS_DELETED != 1
group by rm.RELATIONSHIP_MANAGER_NM, rma.RELATIONSHIP_MANAGER_ID, rm.ORG_STRUCTURE_ID
Table columns have indexes:
rm.RELATIONSHIP_MANAGER_ID,
rmm.RELATIONSHIP_MANAGER_ID,
rmm.MARKET_SEGMENT_ID,
rm_ms.MARKET_SEGMENT_ID,
rma.RELATIONSHIP_MANAGER_ID,
ppa.PORTFOLIO_ID,
rma.PORTFOLIO_ID,
ppa.party_id,
ps.party_id,
p.party_id,
p.MARKET_SEGMENT_ID,
ms.MARKET_SEGMENT_ID
tables PARTY, PERSON have ~1-3 million row,
runtime of query ~20second. I am comment
join MARKET_SEGMENT ms
on p.MARKET_SEGMENT_ID = ms.MARKET_SEGMENT_ID --and ms.MARKET_SEGMENT = '01'
runtime of query became ~3 second.
Explain why this is happening, please ?
Explain plan dont help me.. How i can optimization the query?
EDIT:
platform is DB2 for z/OS V9.7,
added size of table
EDIT2: explain plan shows that the first is always join the small size of the table
Just for grins, see if this makes any difference:
WITH
MktSeg( MARKET_SEGMENT_ID ) AS
( SELECT MARKET_SEGMENT_ID
FROM MARKET_SEGMENT
WHERE MARKET_SEGMENT = '01' )
select rma.RELATION_MANAGER_ID,
rm.ORG_STRUCTURE_ID,
rm.RELATIONSHIP_MANAGER_NM,
count(distinct ppa.PARTY_ID) as count_party
from RELATIONSHIP_MANAGER rm --15808 row
join RELATIONSHIP_MANAGER_MARKET rmm --1560 row
on rm.RELATIONSHIP_MANAGER_ID = rmm.RELATIONSHIP_MANAGER_ID
and rmm.INCLUDE_IN_REPORT = 'Y'
join MktSeg rm_ms --4 row
on rmm.MARKET_SEGMENT_ID = rm_ms.MARKET_SEGMENT_ID
join RELATIONSHIP_MANAGER_ALLOCATION rma --61349 row
on rm.RELATIONSHIP_MANAGER_ID = rma.RELATIONSHIP_MANAGER_ID
join CMD_PARTY_PORTFOLIO_ALLOCATION ppa --3114096 row
on ppa.PORTFOLIO_ID = rma.PORTFOLIO_ID
join person ps --3112575 row
on ps.IS_DELETED != 1 and ppa.party_id = ps.party_id
join PARTY p --3114146 row
on ppa.party_id=p.party_id
join MktSeg ms --4 row
on p.MARKET_SEGMENT_ID = ms.MARKET_SEGMENT_ID
WHERE rm.IS_CM = 1 AND rm.IS_DELETED != 1
group by rm.RELATIONSHIP_MANAGER_NM, rma.RELATIONSHIP_MANAGER_ID,
rm.ORG_STRUCTURE_ID, rma.RELATION_MANAGER_ID
Note also that I've added an item to the Group By clause.