SQL query using a case statement within the group by fields - sql

I have a complex query that joins different tables to get the count. There are a few fields to group by. Now, I want to add an additional field which needs a case statement. And this field also has to be in the group by list. My query originally looks like this -
SELECT DMAGATR.WRK_LOC_LEVEL4
, DMBR.WRK_LOC_NM
, DMBR.RELCD
, COUNT(DISTINCT DMBR.DMBRKEY) AS ELIG_COUNT
FROM DMBR
INNER JOIN DCUST DCUST ON DMBR.DCUSTKEY = DCUST.DCUSTKEY
INNER JOIN DMAGATR DMAGATR ON DMBR.DMBRKEY = DMAGATR.DMBRKEY
LEFT JOIN DMDYNATR DMDYNATR ON DMBR.DMBRKEY = DMDYNATR.DMBRKEY
WHERE DMBR.C_TIMESSTAMP <= '12/31/2011'
AND DMBR.RELCD IN ('0', '1')
AND DMBR.EE_STS IN ( 'A','L')
AND (DMBR.DEL_DT IS NULL
OR DMBR.DEL_DT > '12/31/2011')
AND DCUST.PRCD = 'TAR'
GROUP BY DMAGATR.WRK_LOC_LEVEL4, DMBR.WRK_LOC_NM, D_MEMBER.REL_CD
But the new field looks something like this -
(SELECT CASE
WHEN (DMBR.WRK_LOC_NM = '6' AND DMBR.GDR = 'M' AND DMBR.REL_CD in ('0','1')
AND DMBR.EE_STS IN ('A','L')) THEN 'SEG 1'
ELSE 'OTHER'
END
FROM DMBR) as CMPN
I tried to add it in the select list but it did not work. Then I added it in two places - in the select and also in the group by list. That did not work either.
The errors I got were:
ORA-00904 - CMPN not a valid column
ORACLE prepare error: ORA-22818: subquery expressions not allowed here.
I did some research online and found examples that were close but not identical to mine.
SQL GROUP BY CASE statement with aggregate function
Not sure if I understood the question here
SQL query with count and case statement
This is quite different from my need.
http://jerrytech.blogspot.com/2008/04/can-you-group-by-case-statement-in-sql.html
(this is close, but I don't need the insert statements. I tried this approach but it did not work for me)
Any suggestions would be appreciated.

You are defining CMPN as a field (i.e. a result column) of the query, like the others: DMAGATR.WRK_LOC_LEVEL4, DMBR.WRK_LOC_NM, DMBR.RELCD, COUNT(DISTINCT DMBR.DMBRKEY)...
I think the error is that a SQL SELECT used as a result column must return exactly one row. Since your subquery is just "... FROM DMBR) as CMPN", it returns more than one row for the field, and no database can guess which one you mean.
What you are probably missing is a WHERE clause on the subquery, and possibly a GROUP BY if you are looking for a distinct value from the DMBR table.
Fix that and it should get you much further along. Without knowing the rest of the data structure or relationships, I can't tell what your ultimate result is meant to be.
ADDITIONAL COMMENT...
Looking at the other answers, they apply the CASE WHEN directly to whatever DMBR record the query is currently on, which is the right idea but does not quite work as written. Because the CASE has two possible results, it also has to be part of the GROUP BY: with COUNT(DISTINCT ...), the GROUP BY must list every non-aggregated column, and this CASE/WHEN is one of them. So your ultimate result would have
Lvl, Work Loc, RelCD, Case/when, count(distinct) where...
SEG 1 999
Other 999
Additionally, two conditions in your CASE/WHEN exactly duplicated your WHERE clause, so I removed them from the CASE: rows that fail them are never returned anyway.
So, all that being said, I would write it as...
SELECT
DMAGATR.WRK_LOC_LEVEL4,
DMBR.WRK_LOC_NM,
DMBR.RELCD,
CASE WHEN (DMBR.WRK_LOC_NM = '6'
AND DMBR.GDR = 'M' )
THEN 'SEG 1'
ELSE 'OTHER'
END as WhenStatus,
COUNT (DISTINCT DMBR.DMBRKEY) AS ELIG_COUNT
FROM
DMBR
JOIN DCUST
ON DMBR.DCUSTKEY = DCUST.DCUSTKEY
JOIN DMAGATR
ON DMBR.DMBRKEY = DMAGATR.DMBRKEY
LEFT JOIN DMDYNATR
ON DMBR.DMBRKEY = DMDYNATR.DMBRKEY
WHERE
DMBR.C_TIMESSTAMP <= '12/31/2011'
AND DMBR.RELCD in ('0','1')
AND DMBR.EE_STS IN ('A','L')
AND DCUST.PRCD = 'TAR'
AND ( DMBR.DEL_DT IS NULL
OR DMBR.DEL_DT > '12/31/2011')
GROUP BY
DMAGATR.WRK_LOC_LEVEL4,
DMBR.WRK_LOC_NM,
DMBR.RELCD,
CASE WHEN (DMBR.WRK_LOC_NM = '6'
AND DMBR.GDR = 'M' )
THEN 'SEG 1'
ELSE 'OTHER'
END
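As a minimal sketch of the same idea, the snippet below runs a GROUP BY that repeats the CASE expression against an in-memory SQLite database through Python's sqlite3 module. The miniature DMBR table and its four rows are invented for illustration, the joins and WHERE filters are left out, and SQLite stands in for Oracle, so treat it as a demonstration of the technique rather than of the original query.

```python
import sqlite3

# Hypothetical miniature of the DMBR table; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DMBR (DMBRKEY INTEGER, WRK_LOC_NM TEXT, GDR TEXT, RELCD TEXT)"
)
conn.executemany(
    "INSERT INTO DMBR VALUES (?, ?, ?, ?)",
    [
        (1, "6", "M", "0"),
        (2, "6", "M", "0"),
        (3, "6", "F", "0"),
        (4, "7", "M", "1"),
    ],
)

# The CASE expression appears verbatim in both the SELECT list and the GROUP BY.
rows = conn.execute(
    """
    SELECT WRK_LOC_NM,
           RELCD,
           CASE WHEN WRK_LOC_NM = '6' AND GDR = 'M'
                THEN 'SEG 1' ELSE 'OTHER' END AS CMPN,
           COUNT(DISTINCT DMBRKEY) AS ELIG_COUNT
    FROM DMBR
    GROUP BY WRK_LOC_NM,
             RELCD,
             CASE WHEN WRK_LOC_NM = '6' AND GDR = 'M'
                  THEN 'SEG 1' ELSE 'OTHER' END
    ORDER BY WRK_LOC_NM, CMPN
    """
).fetchall()

for row in rows:
    print(row)
```

Each distinct combination of location, relation code and segment label gets its own row and its own count, which is the shape of result described above.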
Finally, I have sometimes seen a GROUP BY choke on a complex column such as a CASE/WHEN. Some servers, however, allow ordinal references to column positions in the GROUP BY (and in the ORDER BY too). Since the query lists its 4 non-aggregate columns first, followed by the COUNT(DISTINCT ...), you MIGHT be able to get away with changing the GROUP BY clause to...
GROUP BY 1, 2, 3, 4
The numbers refer to the columns in the order they appear in the SELECT list.
--- CLARIFICATION about group by and case-sensitivity
First, case-sensitivity: SQL keywords are not case-sensitive in most engines. I write CASE WHEN ... AND ... THEN ... ELSE ... END in upper case only to make the construct stand out.
As for the "group by" (it also works for the "order by"), the numbers are a shortcut to the ordinal columns of your query. Instead of listing the long column names and re-typing the entire CASE construct a second time, you tell the engine which column of the result set to group or order by. Look at the following (unrelated) query...
select
lastname,
firstname,
sum( orderAmount ) TotalOrders
from
customerOrders
group by
lastname,
firstname
order by
TotalOrders DESC
and
select
lastname,
firstname,
sum( orderAmount ) TotalOrders
from
customerOrders
group by
1,
2
order by
3 DESC
Each would produce the same results. The fictitious customerOrders table is aggregated by last name and first name to show the total per person (assuming no duplicate names for this example; otherwise I would have used a customer ID). Once that is done, the ORDER BY kicks in and puts the customers with the highest totals at the top of the list, in descending order.
The numbers just represent the ordinal positions of the columns returned by the query, instead of typing out the field names long-hand. For your CASE/WHEN, this avoids retyping the expression in the GROUP BY, messing it up, and pulling your hair out figuring out why.
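The equivalence of the two forms can be checked with a small sketch like the one below, which uses Python's sqlite3 module and an in-memory database. The customerOrders rows are invented for illustration. SQLite accepts column positions in both GROUP BY and ORDER BY; other engines may not, so this is only a demonstration on one engine.

```python
import sqlite3

# Hypothetical data for the fictitious customerOrders table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customerOrders (lastname TEXT, firstname TEXT, orderAmount REAL)"
)
conn.executemany(
    "INSERT INTO customerOrders VALUES (?, ?, ?)",
    [
        ("Smith", "Ann", 10.0),
        ("Smith", "Ann", 15.0),
        ("Jones", "Bob", 40.0),
        ("Smith", "Carl", 5.0),
    ],
)

# Grouping and ordering by column names.
by_name = conn.execute(
    """
    SELECT lastname, firstname, SUM(orderAmount) AS TotalOrders
    FROM customerOrders
    GROUP BY lastname, firstname
    ORDER BY TotalOrders DESC
    """
).fetchall()

# The same query using ordinal positions from the SELECT list.
by_position = conn.execute(
    """
    SELECT lastname, firstname, SUM(orderAmount) AS TotalOrders
    FROM customerOrders
    GROUP BY 1, 2
    ORDER BY 3 DESC
    """
).fetchall()

print(by_name == by_position)
```

If an engine treated the numbers as plain constants instead of positions, the second query would collapse everything into one group, so comparing the two result lists is a quick way to find out how a given server behaves.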

You can also try this (derived subquery) approach if the other answers don't work:
SELECT
WRK_LOC_LEVEL4,
WRK_LOC_NM,
RELCD,
CMPN,
COUNT (DISTINCT DMBRKEY) AS ELIG_COUNT
FROM
( SELECT
DMAGATR.WRK_LOC_LEVEL4,
DMBR.WRK_LOC_NM,
DMBR.RELCD,
CASE WHEN (DMBR.WRK_LOC_NM = '6'
AND DMBR.GDR = 'M' )
THEN 'SEG 1'
ELSE 'OTHER'
END
AS CMPN,
DMBR.DMBRKEY
FROM
DMBR
JOIN DCUST
ON DMBR.DCUSTKEY = DCUST.DCUSTKEY
JOIN DMAGATR
ON DMBR.DMBRKEY = DMAGATR.DMBRKEY
LEFT JOIN DMDYNATR
ON DMBR.DMBRKEY = DMDYNATR.DMBRKEY
WHERE
DMBR.C_TIMESSTAMP <= '12/31/2011'
AND DMBR.RELCD in ('0','1')
AND DMBR.EE_STS IN ('A','L')
AND DCUST.PRCD = 'TAR'
AND ( DMBR.DEL_DT IS NULL
OR DMBR.DEL_DT > '12/31/2011')
) TMP
GROUP BY
WRK_LOC_LEVEL4,
WRK_LOC_NM,
RELCD,
CMPN
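A reduced version of the derived-subquery approach can be tried out as follows, using Python's sqlite3 module with an in-memory database. The DMBR rows are invented, and the joins and WHERE filters are omitted, so this is a sketch of the pattern only: the inner query computes the CASE once under the alias CMPN, and the outer query groups by that alias.

```python
import sqlite3

# Hypothetical miniature of the DMBR table; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DMBR (DMBRKEY INTEGER, WRK_LOC_NM TEXT, GDR TEXT)")
conn.executemany(
    "INSERT INTO DMBR VALUES (?, ?, ?)",
    [(1, "6", "M"), (2, "6", "M"), (3, "6", "F"), (4, "7", "M")],
)

# The inner query names the CASE result; the outer query only refers to the alias.
rows = conn.execute(
    """
    SELECT WRK_LOC_NM, CMPN, COUNT(DISTINCT DMBRKEY) AS ELIG_COUNT
    FROM (
        SELECT WRK_LOC_NM,
               CASE WHEN WRK_LOC_NM = '6' AND GDR = 'M'
                    THEN 'SEG 1' ELSE 'OTHER' END AS CMPN,
               DMBRKEY
        FROM DMBR
    ) TMP
    GROUP BY WRK_LOC_NM, CMPN
    ORDER BY WRK_LOC_NM, CMPN
    """
).fetchall()

for row in rows:
    print(row)
```

The CASE expression is written only once here, which removes the risk of the SELECT and GROUP BY copies drifting apart.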

I don't know exactly what you meant by "in the SELECT list", and I don't know why CMPN includes its own SELECT. Are you trying to do the following? If not, how does your goal differ?
SELECT
DMAGATR.WRK_LOC_LEVEL4
,DMBR.WRK_LOC_NM
,DMBR.RELCD
,COUNT (DISTINCT DMBR.DMBRKEY) AS ELIG_COUNT
,(CASE
WHEN (DMBR.WRK_LOC_NM = '6'
AND DMBR.GDR = 'M'
AND DMBR.RELCD in ('0','1')
AND DMBR.EE_STS IN ('A','L'))
THEN 'SEG 1'
ELSE 'OTHER'
END
) as CMPN
FROM DMBR
INNER JOIN DCUST DCUST
ON DMBR.DCUSTKEY = DCUST.DCUSTKEY
INNER JOIN DMAGATR DMAGATR
ON DMBR.DMBRKEY = DMAGATR.DMBRKEY
LEFT JOIN DMDYNATR DMDYNATR
ON DMBR.DMBRKEY = DMDYNATR.DMBRKEY
WHERE DMBR.C_TIMESSTAMP <= '12/31/2011'
AND DMBR.RELCD IN ('0', '1')
AND DMBR.EE_STS IN ( 'A','L')
AND (DMBR.DEL_DT IS NULL
OR DMBR.DEL_DT > '12/31/2011')
AND DCUST.PRCD = 'TAR'
GROUP BY
DMAGATR.WRK_LOC_LEVEL4
,DMBR.WRK_LOC_NM
,DMBR.RELCD
,(CASE
WHEN (DMBR.WRK_LOC_NM = '6'
AND DMBR.GDR = 'M'
AND DMBR.RELCD in ('0','1')
AND DMBR.EE_STS IN ('A','L'))
THEN 'SEG 1'
ELSE 'OTHER'
END)

SELECT
DMAGATR.WRK_LOC_LEVEL4
,DMBR.WRK_LOC_NM
,DMBR.RELCD
,COUNT (DISTINCT DMBR.DMBRKEY) AS ELIG_COUNT,
(SELECT
CASE
WHEN (DMBR.WRK_LOC_NM = '6'
AND DMBR.GDR = 'M'
AND DMBR.RELCD in ('0','1')
AND DMBR.EE_STS IN ('A','L'))
THEN 'SEG 1'
ELSE 'OTHER'
END
) as CMPN
FROM DMBR
INNER JOIN DCUST DCUST
ON DMBR.DCUSTKEY = DCUST.DCUSTKEY
INNER JOIN DMAGATR DMAGATR
ON DMBR.DMBRKEY = DMAGATR.DMBRKEY
LEFT JOIN DMDYNATR DMDYNATR
ON DMBR.DMBRKEY = DMDYNATR.DMBRKEY
WHERE DMBR.C_TIMESSTAMP <= '12/31/2011'
AND DMBR.RELCD IN ('0', '1')
AND DMBR.EE_STS IN ( 'A','L')
AND (DMBR.DEL_DT IS NULL
OR DMBR.DEL_DT > '12/31/2011')
AND DCUST.PRCD = 'TAR'
GROUP BY
DMAGATR.WRK_LOC_LEVEL4
,DMBR.WRK_LOC_NM
,DMBR.RELCD, DMBR.GDR, DMBR.EE_STS

Related

Query is returning two rows instead of just one

I have a query that returns the dimensions of a package in M2 (square metres) and UN (units). The current query returns two different rows because I am using a CASE WHEN. This is the query:
SELECT DISTINCT(C.Package) 'Package',
CASE S.Unity WHEN 'M2' THEN SUM(L.Qt*S.ConvEst) ELSE NULL END 'M2',
CASE S.Unity WHEN 'UN' THEN SUM(L.Qt) ELSE NULL END 'UN'
FROM
PackageTable AS C
INNER JOIN
PackageTableRows L ON L.Package = C.Package
INNER JOIN
Products S ON S.Product = L.Product
WHERE
C.Package = '587496'
GROUP BY
C.Package, S.Unity
This result:
But what I really want the query to return is something like this:
With only one line. I know I am not using CASE WHEN correctly, and that is why I need your help.
You have several problems here. Firstly, DISTINCT is not a function; it's an operator. DISTINCT affects the entire dataset and causes only distinct rows to be returned. It's not DISTINCT ({Column Name}); it's SELECT DISTINCT {Columns}.
Next, you have both DISTINCT and GROUP BY; this is a flaw. A GROUP BY clause already causes your data to be returned in distinct groups, so a DISTINCT is both redundant and unneeded overhead. Get rid of the DISTINCT. If you are getting different results when you have a DISTINCT with a GROUP BY this is a strong indication that your GROUP BY clause is wrong and needs addressing (most likely you have too many columns in the clause).
Finally, when performing conditional aggregation, the aggregate function should be around the entire CASE expression, not around an expression in the THEN. This also means that you need to remove the column in your WHEN clause from the GROUP BY, as I suspect the only reason you have it there is because you had to.
This results in:
SELECT C.Package AS Package,
SUM(CASE S.Unity WHEN 'M2' THEN L.Qt * S.ConvEst END) AS M2,
SUM(CASE S.Unity WHEN 'UN' THEN L.Qt END) AS UN
FROM dbo.PackageTable C
INNER JOIN dbo.PackageTableRows L ON L.Package = C.Package
INNER JOIN dbo.Products S ON S.Product = L.Product
WHERE C.Package = '587496'
GROUP BY C.Package;
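To see the difference between the two shapes of query, here is a small sketch using Python's sqlite3 module and an in-memory database. The table contents (three product rows for package '587496') are invented for illustration, the PackageTable join is dropped to keep it short, and SQLite stands in for the original server.

```python
import sqlite3

# Hypothetical rows: two products measured in M2 and one in UN.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PackageTableRows (Package TEXT, Product TEXT, Qt INTEGER)")
conn.execute("CREATE TABLE Products (Product TEXT, Unity TEXT, ConvEst REAL)")
conn.executemany(
    "INSERT INTO PackageTableRows VALUES (?, ?, ?)",
    [("587496", "A", 2), ("587496", "B", 4), ("587496", "C", 3)],
)
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?)",
    [("A", "M2", 1.5), ("B", "M2", 0.5), ("C", "UN", 1.0)],
)

# Aggregate inside the CASE and group by Unity: one row per unit.
per_unit = conn.execute(
    """
    SELECT L.Package,
           CASE S.Unity WHEN 'M2' THEN SUM(L.Qt * S.ConvEst) END AS M2,
           CASE S.Unity WHEN 'UN' THEN SUM(L.Qt) END AS UN
    FROM PackageTableRows L
    JOIN Products S ON S.Product = L.Product
    WHERE L.Package = '587496'
    GROUP BY L.Package, S.Unity
    """
).fetchall()

# Aggregate around the CASE and group by Package only: a single row.
one_row = conn.execute(
    """
    SELECT L.Package,
           SUM(CASE S.Unity WHEN 'M2' THEN L.Qt * S.ConvEst END) AS M2,
           SUM(CASE S.Unity WHEN 'UN' THEN L.Qt END) AS UN
    FROM PackageTableRows L
    JOIN Products S ON S.Product = L.Product
    WHERE L.Package = '587496'
    GROUP BY L.Package
    """
).fetchall()

print(len(per_unit), one_row)
```

With the CASE inside the SUM, rows of the other unit contribute NULL and are simply ignored by the aggregate, so Unity no longer needs to be a grouping column.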
It's mostly correct. You need to GROUP BY only C.Package to bring the result onto a single line. The CASE should return 0 in the ELSE branch, and the aggregation should wrap the whole CASE expression rather than only the measure.
So it will look like this.
SELECT C.Package 'Package',
SUM(CASE S.Unity WHEN 'M2' THEN (L.Qt*S.ConvEst) ELSE 0 END ) as 'M2',
SUM(CASE S.Unity WHEN 'UN' THEN L.Qt ELSE 0 END) AS 'UN'
FROM PackageTable AS C
INNER JOIN PackageTableRows L ON L.Package=C.Package
INNER JOIN Products S ON S.Product=L.Product
WHERE C.Package='587496'
GROUP BY C.Package

Arithmetic overflow error converting varchar to data type numeric CASE statement

I am trying to write a query that returns an "Estimated Annual Value", and for this, I am using a Case statement. There are two inner joins involved in the query.
The query runs and gives me a result when I run this piece:
select Users.Id, Name, PhoneNumber, Email, Currency, count(*) as TotalOrders, sum(TotalCustomerAmount) as TotalOrderValue, avg(TotalCustomerAmount) as AverageOrderValue, TsCreate as RegistrationDate, max(TsOrderPlaced) as LastOrderDate, min(TsOrderPlaced) as FirstOrderDate,
CASE
When PromotionsEnabled = 0 then 'Y'
When PromotionsEnabled = 1 then 'n'
else 'undefined'
end as Promotions,
/*CASE
When ((DATEDIFF(day, max(TsOrderPlaced), min(TsOrderPlaced)) >= 6) AND (count(*) > 3)) then ((sum(TotalCustomerAmount)/(DATEDIFF(day, max(TsOrderPlaced), min(TsOrderPlaced))))*365)
Else '-'
end as EstimatedAnnualValue*/
from AspNetUsers with (nolock)
inner join Orders with (nolock) on Orders.UserId = AspNetUsers.Id and Orders.WhiteLabelConfigId = #WhiteLabelConfigId
and Orders.OrderState in (2,3,4,12)
inner join UserWhiteLabelConfigs with (nolock) on UserWhiteLabelConfigs.UserId = AspNetUsers.Id and Orders.WhiteLabelConfigId = #WhiteLabelConfigId
where AspNetUsers.Discriminator = 'ApplicationUser'
group by Users.Id, Name, Number, Currency, Email, TsCreate, PromotionsEnabled
But the problem comes when I run this with the second CASE statement. Is it because I cannot use the aggregate function before GROUP BY? I am also thinking of using a subquery.
Looking for some help!
You need to use aggregation functions. For instance, if you want 'Y' only when all values are 0 or NULL:
(case when max(PromotionsEnabled) = 0 then 'Y'
when max(PromotionsEnabled) = 1 then 'N'
else 'undefined'
end) as Promotions,
I'm not sure if this is the logic you want (because that detail is not in the question). However, this shows that you can use aggregation functions in a case expression.
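A small sketch of an aggregate used inside a CASE expression, run through Python's sqlite3 module against an in-memory database. The single Orders table with a PromotionsEnabled column is a simplification assumed for this example (the original query joins several tables), and the rows are invented.

```python
import sqlite3

# Hypothetical simplified table: one row per order, with the user's promotion flag.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (UserId INTEGER, PromotionsEnabled INTEGER)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?)",
    [(1, 0), (1, 0), (2, 0), (2, 1), (3, None)],
)

# PromotionsEnabled is not in the GROUP BY, so it is wrapped in MAX() inside the CASE.
rows = conn.execute(
    """
    SELECT UserId,
           CASE WHEN MAX(PromotionsEnabled) = 0 THEN 'Y'
                WHEN MAX(PromotionsEnabled) = 1 THEN 'N'
                ELSE 'undefined'
           END AS Promotions,
           COUNT(*) AS TotalOrders
    FROM Orders
    GROUP BY UserId
    ORDER BY UserId
    """
).fetchall()

for row in rows:
    print(row)
```

MAX ignores NULLs, so a user whose flag is always NULL falls through to the ELSE branch, while a user with any 1 is reported as 'N'.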

Using a CASE WHEN statement and an IN (SELECT...FROM) subquery

I'm trying to create a temp table and build out different CASE WHEN logic for two different medications. In short I have two columns of interest for these CASE WHEN statements; procedure_code and ndc_code. There are only 3 procedure codes that I need, but there are about 20 different ndc codes. I created a temp.ndcdrug1 temp table with these ndc codes for medication1 and temp.ndcdrug2 for the ndc codes for medication2 instead of listing out each ndc code individually. My query looks like this:
CREATE TABLE temp.flags AS
SELECT DISTINCT a.userid,
CASE WHEN (procedure_code = 'J7170' OR ndc_code in (select ndc_code from temp.ndcdrug1)) THEN 'Y' ELSE 'N' END AS Drug1,
CASE WHEN (procedure_code = 'J7205' OR procedure_code = 'C9136' OR ndc_code in (select ndc_code from temp.ndcdrug2)) THEN 'Y' ELSE 'N' END AS Drug2,
CASE WHEN (procedure_code = 'J7170' AND procedure_code = 'J7205') THEN 'Y' ELSE 'N' END AS Both
FROM table1 a
LEFT JOIN table2 b
ON a.userid = b.userid
WHERE...
AND...
When I run this, it returns: org.apache.spark.sql.AnalysisException: IN/EXISTS predicate sub-queries can only be used in a Filter.
I could list these ndc_code values out individually, but there are a lot of them so wanted a more efficient way of going about this. Is there a way to use a sub select query like this when writing out CASE WHEN's?
Try replacing each IN predicate with a correlated scalar subquery:
CREATE TABLE temp.flags AS
SELECT DISTINCT a.userid,
CASE WHEN (
procedure_code = 'J7170' OR
(select min('1') from temp.ndcdrug1 m where m.ndc_code = a.ndc_code) = '1'
) THEN 'Y' ELSE 'N' END AS Drug1,
CASE WHEN (
procedure_code = 'J7205' OR
procedure_code = 'C9136' OR
(select min('1') from temp.ndcdrug2 m where m.ndc_code = a.ndc_code) = '1'
) THEN 'Y' ELSE 'N' END AS Drug2,
CASE WHEN (procedure_code = 'J7170' AND procedure_code = 'J7205')
THEN 'Y' ELSE 'N' END AS Both
FROM table1 a
LEFT JOIN table2 b
ON a.userid = b.userid
WHERE...
AND...
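The sketch below checks, on a tiny invented data set, that the scalar-subquery form yields the same flags as the IN form. It uses Python's sqlite3 module; SQLite does not have Spark's restriction on IN subqueries, so this only shows that the two forms are logically equivalent here, not that a given Spark version accepts the rewrite. The table names without the temp. prefix and all rows are assumptions for the example.

```python
import sqlite3

# Hypothetical data: table1 holds claims, ndcdrug1 holds the NDC codes of interest.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (userid INTEGER, procedure_code TEXT, ndc_code TEXT)")
conn.execute("CREATE TABLE ndcdrug1 (ndc_code TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(1, "J7170", None), (2, "X0000", "111"), (3, "X0000", "999")],
)
conn.execute("INSERT INTO ndcdrug1 VALUES ('111')")

# Original form: IN subquery inside the CASE.
with_in = conn.execute(
    """
    SELECT DISTINCT a.userid,
           CASE WHEN a.procedure_code = 'J7170'
                  OR a.ndc_code IN (SELECT ndc_code FROM ndcdrug1)
                THEN 'Y' ELSE 'N' END AS Drug1
    FROM table1 a
    ORDER BY a.userid
    """
).fetchall()

# Rewritten form: correlated scalar subquery that returns '1' when a match exists.
with_scalar = conn.execute(
    """
    SELECT DISTINCT a.userid,
           CASE WHEN a.procedure_code = 'J7170'
                  OR (SELECT MIN('1') FROM ndcdrug1 m
                      WHERE m.ndc_code = a.ndc_code) = '1'
                THEN 'Y' ELSE 'N' END AS Drug1
    FROM table1 a
    ORDER BY a.userid
    """
).fetchall()

print(with_in == with_scalar, with_scalar)
```

When no code matches, MIN over an empty set is NULL, the comparison is NULL, and the CASE falls through to 'N'.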

SQL Combining Two Totally separate tables to one

I am VERY new to SQL and self-taught. I have two SQL queries that I struggled through but got working. Now I need to combine them into one and I'm lost.
SELECT
s.lastfirst,
s.student_number,
SUM(tr.howmany)
FROM
students s
JOIN
truancies tr ON s.id = tr.studentid
WHERE
s.enroll_status = 0 AND
s.schoolid = ~(curschoolid)
GROUP BY
s.lastfirst, s.student_number
HAVING
SUM(tr.howmany) > 0
ORDER BY
s.lastfirst
And this query:
SELECT
S.DCID as DCID,
S.LASTFIRST as LASTFIRST,
S.STUDENT_NUMBER as STUDENT_NUMBER,
S2.FC_SRVC_HRS_DUE as FC_SRVC_HRS_DUE,
CASE
WHEN S2.FC_SRVC_HRS_COMPLETED IS NULL
THEN '0'
ELSE S2.FC_SRVC_HRS_COMPLETED
END AS FC_SRVC_HRS_COMPLETED,
S2.FC_SRVC_HRS_BUYOUT as FC_SRVC_HRS_BUYOUT,
CASE
WHEN S2.FC_SRVC_HRS_COMPLETED IS NULL
THEN S2.FC_SRVC_HRS_DUE * S2.FC_SRVC_HRS_BUYOUT
ELSE ((S2.FC_SRVC_HRS_DUE - S2.FC_SRVC_HRS_COMPLETED) * S2.FC_SRVC_HRS_BUYOUT)
END as Balance_Due
FROM
STUDENTS S, U_STUDENTSUSERFIELDS S2
WHERE
S.DCID = S2.STUDENTSDCID AND
s.enroll_status = 0 AND
s.schoolid = ~(curschoolid) AND
(((S2.FC_SRVC_HRS_DUE - S2.FC_SRVC_HRS_COMPLETED) * S2.FC_SRVC_HRS_BUYOUT) > 0 OR
((S2.FC_SRVC_HRS_DUE - S2.FC_SRVC_HRS_COMPLETED) * S2.FC_SRVC_HRS_BUYOUT) IS NULL) AND
S2.FC_SRVC_HRS_DUE >.1
ORDER BY
s.lastfirst
What I am really looking for are the totals of both of these. I want the SUM(tr.howmany) from the first query and the balance due from the second, BUT I need the filters that are in there. This would be sorted by student. I hope I am making sense. Any assistance would be appreciated.
You can join two separate SQL SELECT statements together, e.g.:
Select A.id, A.value, B.value
From (select id, count(*) as value from TableA ...) AS A
join (select id, sum(field) as value from TableB ...) AS B
on A.id = B.id
order by A.id
As long as you have a common field to join on this would work. In your case the student_number looks like a good candidate. You will have to do the ordering outside of your subqueries.
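A minimal sketch of joining two aggregated subqueries on student_number, using Python's sqlite3 module and an in-memory database. The fees table, its amount column and all rows are hypothetical stand-ins for the second query's balance calculation, which is not reproduced here.

```python
import sqlite3

# Hypothetical tables and rows; fees stands in for the balance-due calculation.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER, student_number INTEGER, lastfirst TEXT)")
conn.execute("CREATE TABLE truancies (studentid INTEGER, howmany INTEGER)")
conn.execute("CREATE TABLE fees (studentid INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [(1, 100, "Doe, Jane"), (2, 200, "Roe, Rick")],
)
conn.executemany("INSERT INTO truancies VALUES (?, ?)", [(1, 2), (1, 3), (2, 1)])
conn.executemany("INSERT INTO fees VALUES (?, ?)", [(1, 10), (1, 5)])

# Each subquery keeps its own filters and grouping; the join and ORDER BY sit outside.
rows = conn.execute(
    """
    SELECT A.student_number, A.truancy_total, B.balance_due
    FROM (SELECT s.student_number, SUM(tr.howmany) AS truancy_total
          FROM students s
          JOIN truancies tr ON s.id = tr.studentid
          GROUP BY s.student_number
          HAVING SUM(tr.howmany) > 0) A
    JOIN (SELECT s.student_number, SUM(f.amount) AS balance_due
          FROM students s
          JOIN fees f ON s.id = f.studentid
          GROUP BY s.student_number) B
      ON A.student_number = B.student_number
    ORDER BY A.student_number
    """
).fetchall()

print(rows)
```

Note that an inner join keeps only students present in both subqueries (student 200 has no fees rows here and drops out); a LEFT JOIN from one side would keep students that appear in only that side.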

Oracle SQL Optimization

I am creating a report for a client to retrieve invoices and their ship-to location. Currently the query is very slow, and I would be grateful if anyone could advise how to optimize it. I do have a lot of outer joins in the query and I believe this might be the issue.
Any advice would be appreciated.
SELECT
b.operating_unit,
b.trading_partner,
b.invoice_date,
b.type,
b.gl_date,
b.invoice_num,
b.invoice_id,
b.quantity,
b.unit_price,
b.uom,
b.invoice_currency_code,
b.payment_method_code,
b.terms,
b.ITEM_DESCRIPTION,
b.LINE_NUMBER,
b.item_code,
b.VAT_CODE,
b.amount,
b.invoice_amount,
b.vat_amount,
b.discount_amount,
b.price_variance,
b.total_amount,
b.status,
b.ship_to_location
FROM(
SELECT
a.operating_unit,
a.trading_partner,
a.invoice_date,
a.type,
a.gl_date,
a.invoice_num,
a.invoice_id,
a.quantity,
ROUND(a.unit_price,2) unit_price,
a.uom,
a.invoice_currency_code,
a.payment_method_code,
a.terms,
a.LINE_NUMBER,
a.ITEM_DESCRIPTION,
a.item_code,
a.VAT_CODE,
--CASE WHEN a.status = 'CANCELLED' THEN NULL
--ELSE a.VAT_CODE END AS VAT_CODE,
sum(a.AMOUNT) amount,
a.invoice_amount,
sum(a.vat_amount) vat_amount,
sum(a.discount_amount) discount_amount,
sum(a.price_variance) price_variance,
sum(a.total_amount) total_amount,
CASE WHEN a.status = 'FULLY' THEN 'Fully Applied'
WHEN a.status = 'UNAPPROVED' THEN 'Unvalidated'
WHEN a.status = 'NEEDS REAPPROVAL' THEN 'Needs Revalidation'
WHEN a.status = 'APPROVED' THEN 'Validated'
WHEN a.status = 'NEVER APPROVED' THEN 'Never Validated'
WHEN a.status = 'CANCELLED' THEN 'Cancelled'
WHEN a.status = 'UNPAID' THEN 'Unpaid'
WHEN a.status = 'AVAILABLE' THEN 'Available'
END AS status,
a.ship_to_location
from(
Select
hz.name operating_unit,
aia.INVOICE_TYPE_LOOKUP_CODE type,
aps.vendor_name trading_partner,
aia.INVOICE_DATE,
aia.gl_date,
aia.invoice_num,
aia.INVOICE_AMOUNT,
aida.invoice_id,
APIDA.LINE_NUMBER,
APIDA.QUANTITY_INVOICED QUANTITY,
APIDA.UNIT_PRICE,
APIDA.UOM,
aia.INVOICE_CURRENCY_CODE,
aia.PAYMENT_METHOD_CODE,
apt.name terms,
APIDA.ITEM_DESCRIPTION,
APIDA.ITEM_CODE,
case when apida.line_type_lookup_code <> 'IPV' THEN APIDA.AMOUNT ELSE 0 END AS AMOUNT,
--case when aida.line_type_lookup_code = 'REC_TAX' THEN aida.RECOVERY_RATE_NAME END AS VAT_CODE,
APIDA.VAT_CODE,
APIDA.VAT_AMOUNT,
0 DISCOUNT_AMOUNT,
case when apida.line_type_lookup_code = 'IPV' THEN APIDA.AMOUNT ELSE 0 END AS PRICE_VARIANCE,
APIDA.TOTAL_AMOUNT,
APPS.AP_INVOICES_PKG.GET_APPROVAL_STATUS
(
aia.INVOICE_ID
,aia.INVOICE_AMOUNT
,aia.PAYMENT_STATUS_FLAG
,aia.INVOICE_TYPE_LOOKUP_CODE
) status,
APIDA.SHIP_TO_LOCATION
--b.description ship_to_location
from (SELECT AILA.INVOICE_ID, AILA.LINE_TYPE_LOOKUP_CODE, AILA.LINE_NUMBER, aila.TAX_CLASSIFICATION_CODE VAT_CODE, AILA.QUANTITY_INVOICED, AILA.UNIT_PRICE, AILA.UNIT_MEAS_LOOKUP_CODE UOM, X.invoice_distribution_id, X.Description ITEM_DESCRIPTION, msi.segment1 item_code, NVL(X.AMOUNT, 0) AMOUNT, NVL(B.TAX_AMOUNT,0) VAT_AMOUNT, (NVL(X.AMOUNT, 0) + NVL(B.TAX_AMOUNT,0)) TOTAL_AMOUNT,HR.DESCRIPTION SHIP_TO_LOCATION
FROM ap_invoice_lines_all aila, ap_invoice_distributions_all X, MTL_SYSTEM_ITEMS msi, hr_locations hr,
(SELECT A.INVOICE_ID, A.LINE_TYPE_LOOKUP_CODE, A.INVOICE_LINE_NUMBER, A.CHARGE_APPLICABLE_TO_DIST_ID, SUM(A.AMOUNT) TAX_AMOUNT
FROM ap_invoice_distributions_all A
WHERE 1=1
AND A.LINE_TYPE_LOOKUP_CODE = 'REC_TAX'
GROUP BY A.INVOICE_ID, A.LINE_TYPE_LOOKUP_CODE, A.INVOICE_LINE_NUMBER, A.CHARGE_APPLICABLE_TO_DIST_ID) B
WHERE AILA.INVOICE_ID = X.INVOICE_ID(+)
AND X.INVOICE_ID = B.INVOICE_ID(+)
AND X.invoice_distribution_id = B.CHARGE_APPLICABLE_TO_DIST_ID(+)
and msi.inventory_item_id(+) = aila.inventory_item_id
AND AILA.SHIP_TO_LOCATION_ID = HR.SHIP_TO_LOCATION_ID
and aila.line_number = X.INVOICE_LINE_NUMBER(+)
AND AILA.LINE_TYPE_LOOKUP_CODE != 'REC_TAX' AND AILA.LINE_TYPE_LOOKUP_CODE != 'NONREC_TAX'
--AND AILA.INVOICE_ID = '10T52233547'
)APIDA,
ap_invoice_distributions_all aida, ap_invoices_all aia, ap_suppliers aps, ap_terms apt, hr_organization_units hz
where aia.invoice_id = aida.invoice_id(+)
and aia.invoice_id = APIDA.invoice_id(+)
and aps.vendor_id = aia.vendor_id
and apt.term_id = aia.terms_id
and hz.ORGANIZATION_ID = aia.org_id
--and aia.invoice_NUM = '123456'
--and aida.LINE_TYPE_LOOKUP_CODE = 'REC_TAX'
/* Parameters */
and hz.ORGANIZATION_ID between NVL(:p_operating_unit_from, hz.ORGANIZATION_ID) and NVL(:p_operating_unit_to, hz.ORGANIZATION_ID)
and aia.INVOICE_DATE between NVL(:p_invoice_date_from, aia.INVOICE_DATE) and NVL(:p_invoice_date_to, aia.INVOICE_DATE)
and aps.vendor_id between NVL(:p_trading_partner_from, aps.vendor_id) and NVL(:p_trading_partner_to, aps.vendor_id)
and aia.gl_date between NVL(:p_gl_date_from, aia.gl_date) and NVL(:p_gl_date_to, aia.gl_date)
order by hz.name, aps.vendor_name, APIDA.SHIP_TO_LOCATION, APIDA.ITEM_DESCRIPTION)a
group by a.operating_unit,
a.trading_partner,
a.invoice_date,
a.gl_date,
a.invoice_num,
a.invoice_amount,
a.invoice_id,
a.invoice_currency_code,
a.payment_method_code,
a.terms,
a.ITEM_DESCRIPTION,
a.item_code,
a.VAT_CODE,
a.quantity,
a.unit_price,
a.LINE_NUMBER,
a.uom,
a.status,
a.type,
a.ship_to_location)b
order by b.operating_unit, b.trading_partner, b.ship_to_location, b.invoice_date, b.gl_date, b.invoice_num, b.LINE_NUMBER
You are using ORDER BY in a subquery, which has no effect, so you should remove it. If the optimizer doesn't already notice and ignore it, this would speed up the query.
The comma-separated join syntax you are using was standard in the 1980s, but is now considered error-prone and hard to read. You may want to replace it with explicit joins (e.g. INNER JOIN and LEFT OUTER JOIN) for readability and maintainability. However, I see no errors in your joins, so readability is all you would gain.
The outermost query is superfluous, but shouldn't cause any slowdown, so that is no problem either.
This is simply a rather complicated query with some outer joins and two GROUP BY clauses. There is no obvious error. Check whether you have indexes on the columns used in your WHERE clauses. You may want some composite indexes, too. I expect all IDs to be indexed already. I suggest the following three additional indexes:
ap_invoice_lines_all(invoice_id, line_number)
ap_invoice_distributions_all(invoice_id, invoice_line_number)
ap_invoice_distributions_all(line_type_lookup_code)
I am a newbie to Oracle SQL. I did my investigation, and the issue is not with the outer joins but rather with the DISTINCT. It has to group and sort a large amount of data, and this is causing the delay.
I have to rework this code and avoid using DISTINCT. When I have a solution, I will post it here so other newbies like me can learn something new.