Query optimization for an IF EXISTS subquery

I am trying to optimize the query below:
if exists (select 1
from GHUB_DISCREPANCY_REPORT (NOLOCK)
where PARTNO = @currentpn and orderID = @oldorderid + 1
and (Discr_Fox_Available = 'Y'
or Discr_Fox_NC = 'Y' or Discr_FOC_Available = 'Y'
or Discr_FOC_NC = 'Y' or Discr_Cpa_Available = 'Y'
or Discr_Cpa_NC = 'Y' or Discr_Fox_Tot = 'Y'
or Discr_FOC_Tot = 'Y' or Discr_Cpa_Tot = 'Y'))
I indexed the primary key, PartNo, Aging and OrderID columns.
Is there any other way I can optimize this query?
Please suggest!

First, try an index on GHUB_DISCREPANCY_REPORT(PARTNO, orderId). This may be a big help for your query.
If you still have performance problems, one method is to use separate queries, each of which can be optimized with a separate index.
if exists (select 1
from GHUB_DISCREPANCY_REPORT (NOLOCK)
where PARTNO = @currentpn and orderID = @oldorderid + 1 and Discr_Fox_Available = 'Y'
) or
. . .
You would then need a separate composite index for each flag, for example GHUB_DISCREPANCY_REPORT(PARTNO, orderId, Discr_Fox_Available). This is a lot of index overhead, but it could be worth it.
Another idea is to combine all the flags into one:
alter table GHUB_DISCREPANCY_REPORT
add AnyFlags as (case when (Discr_Fox_Available = 'Y'
or Discr_Fox_NC = 'Y' or Discr_FOC_Available = 'Y'
or Discr_FOC_NC = 'Y' or Discr_Cpa_Available = 'Y'
or Discr_Cpa_NC = 'Y' or Discr_Fox_Tot = 'Y'
or Discr_FOC_Tot = 'Y' or Discr_Cpa_Tot = 'Y') then 'Y' else 'N' end);
You can add an index on a computed column and then use the value in your query:
create index idx_GHUB_DISCREPANCY_REPORT_anyflags on GHUB_DISCREPANCY_REPORT(PARTNO, OrderId, AnyFlags);
if exists (select 1
from GHUB_DISCREPANCY_REPORT (NOLOCK)
where PARTNO = @currentpn and orderID = @oldorderid + 1 and AnyFlags = 'Y'
)
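As a rough illustration of the first suggestion, here is a minimal sketch that uses Python's sqlite3 module as a stand-in for SQL Server. The table `report`, its two flag columns, and the sample rows are hypothetical; the point is only that a composite index on the two equality columns lets the EXISTS probe seek straight to the candidate rows before the OR'ed flags are checked. SQLite's planner is not SQL Server's, so treat the plan output as indicative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE report (partno TEXT, order_id INTEGER, flag_a TEXT, flag_b TEXT)"
)
# Hypothetical sample data standing in for GHUB_DISCREPANCY_REPORT.
conn.executemany(
    "INSERT INTO report VALUES (?, ?, ?, ?)",
    [("P1", 1, "N", "Y"), ("P1", 2, "N", "N"), ("P2", 1, "N", "N")],
)
# Composite index on the two columns compared with '='.
conn.execute("CREATE INDEX idx_report_part_order ON report (partno, order_id)")

query = """
    SELECT EXISTS (
        SELECT 1 FROM report
        WHERE partno = ? AND order_id = ?
          AND (flag_a = 'Y' OR flag_b = 'Y')
    )
"""
found_p1_1 = conn.execute(query, ("P1", 1)).fetchone()[0]
found_p1_2 = conn.execute(query, ("P1", 2)).fetchone()[0]

# The fourth column of EXPLAIN QUERY PLAN holds the human-readable plan step.
plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query, ("P1", 1))]
uses_index = any("idx_report_part_order" in step for step in plan)
print(found_p1_1, found_p1_2, uses_index)
```

On SQL Server you would check the same thing in the actual execution plan (an Index Seek on the new index rather than a scan).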

"top 1 1 from tbl_name" gives much better performance than " select 1" in the "if exists" (sub query) part.
Refer to this discussion
After using "top 1 1", the estimated number of rows dropped from 135765 to 1 in my case.

Related

Using a CASE WHEN statement and an IN (SELECT...FROM) subquery

I'm trying to create a temp table and build out different CASE WHEN logic for two different medications. In short I have two columns of interest for these CASE WHEN statements; procedure_code and ndc_code. There are only 3 procedure codes that I need, but there are about 20 different ndc codes. I created a temp.ndcdrug1 temp table with these ndc codes for medication1 and temp.ndcdrug2 for the ndc codes for medication2 instead of listing out each ndc code individually. My query looks like this:
CREATE TABLE temp.flags AS
SELECT DISTINCT a.userid,
CASE WHEN (procedure_code = 'J7170' OR ndc_code in (select ndc_code from temp.ndcdrug1)) THEN 'Y' ELSE 'N' END AS Drug1,
CASE WHEN (procedure_code = 'J7205' OR procedure_code = 'C9136' OR ndc_code in (select ndc_code from temp.ndcdrug2)) THEN 'Y' ELSE 'N' END AS Drug2,
CASE WHEN (procedure_code = 'J7170' AND procedure_code = 'J7205') THEN 'Y' ELSE 'N' END AS Both
FROM table1 a
LEFT JOIN table2 b
ON a.userid = b.userid
WHERE...
AND...
When I run this, it returns: org.apache.spark.sql.AnalysisException: IN/EXISTS predicate sub-queries can only be used in a Filter.
I could list these ndc_code values out individually, but there are a lot of them, so I wanted a more efficient way of going about this. Is there a way to use a subselect like this when writing out CASE WHEN expressions?
One option is to replace each IN with a correlated scalar subquery:
CREATE TABLE temp.flags AS
SELECT DISTINCT a.userid,
CASE WHEN (
procedure_code = 'J7170' OR
(select min('1') from temp.ndcdrug1 m where m.ndc_code = a.ndc_code) = '1'
) THEN 'Y' ELSE 'N' END AS Drug1,
CASE WHEN (
procedure_code = 'J7205' OR
procedure_code = 'C9136' OR
(select min('1') from temp.ndcdrug2 m where m.ndc_code = a.ndc_code) = '1'
) THEN 'Y' ELSE 'N' END AS Drug2,
CASE WHEN (procedure_code = 'J7170' AND procedure_code = 'J7205')
THEN 'Y' ELSE 'N' END AS Both
FROM table1 a
LEFT JOIN table2 b
ON a.userid = b.userid
WHERE...
AND...
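To see that the scalar-subquery form gives the same flags as the IN form, here is a small sketch using Python's sqlite3 module. SQLite is only a stand-in here: it accepts both forms, so it does not reproduce the Spark error, and the `claims` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE claims (userid INTEGER, procedure_code TEXT, ndc_code TEXT);
    CREATE TABLE ndcdrug1 (ndc_code TEXT);
    INSERT INTO claims VALUES (1, 'J7170', NULL), (2, 'X0000', '111'), (3, 'X0000', '999');
    INSERT INTO ndcdrug1 VALUES ('111'), ('222');
""")

# Original form: IN (subquery) inside the CASE expression.
in_version = conn.execute("""
    SELECT userid,
           CASE WHEN procedure_code = 'J7170'
                  OR ndc_code IN (SELECT ndc_code FROM ndcdrug1)
                THEN 'Y' ELSE 'N' END AS drug1
    FROM claims ORDER BY userid
""").fetchall()

# Rewritten form: correlated scalar subquery that yields '1' on a match, NULL otherwise.
scalar_version = conn.execute("""
    SELECT c.userid,
           CASE WHEN c.procedure_code = 'J7170'
                  OR (SELECT MIN('1') FROM ndcdrug1 m
                      WHERE m.ndc_code = c.ndc_code) = '1'
                THEN 'Y' ELSE 'N' END AS drug1
    FROM claims c ORDER BY c.userid
""").fetchall()
print(in_version, scalar_version)
```

When no row matches, MIN over an empty set is NULL, the comparison with '1' is not true, and the CASE falls through to 'N'.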

How can I prevent a Lazy Spool from happening in my query?

I have been struggling to optimize this query:
SELECT
dbo.OE61BLIN.Order_Key
,dbo.OE61BLIN.Doc_Type
,dbo.OE61BHED.Doc__
,dbo.OE61BHED.Inv_Date
,dbo.OE61BHED.Cust__
,dbo.OE61BLIN.Line_Type
,dbo.OE61BLIN.Item__
,dbo.OE61BLIN.Description
,(CASE
WHEN dbo.OE61BLIN.Doc_Type = 'I' THEN dbo.OE61BLIN.Qty_Shipped * dbo.OE61BLIN.Unit_Factor
WHEN dbo.OE61BLIN.Doc_Type = 'C' AND
dbo.OE61BLIN.return_to_inventory_ = 1 THEN -dbo.OE61BLIN.Qty_Shipped * dbo.OE61BLIN.Unit_Factor
ELSE 0
END) AS QTY
,(CASE
WHEN dbo.OE61BLIN.Doc_Type = 'I' THEN dbo.OE61BLIN.Ext_Price
WHEN dbo.OE61BLIN.Doc_Type = 'C' THEN -dbo.OE61BLIN.Ext_Price
ELSE 0
END) * (CASE
WHEN ISNULL(dbo.OE61BHED.Inv_Disc__, 0) <> 0 THEN 1 - (dbo.OE61BHED.Inv_Disc__ / 100)
ELSE 1
END)
AS amount
,dbo.OE61BHED.Inv_Disc__
,dbo.OE61BLIN.ITEM_GROUP
,dbo.OE61BLIN.Category
,ISNULL(dbo.AR61ACST.intercompany, 0) AS intercompany
FROM dbo.OE61BHED
LEFT OUTER JOIN dbo.AR61ACST
ON dbo.OE61BHED.Cust__ = dbo.AR61ACST.Cust__
RIGHT OUTER JOIN dbo.OE61BLIN
ON dbo.OE61BHED.Order_Key = dbo.OE61BLIN.Order_Key
WHERE (dbo.OE61BLIN.Line_Type = 'R')
AND isnull(intercompany,0) != 1
AND (dbo.OE61BLIN.Doc_Type = 'C'
OR dbo.OE61BLIN.Doc_Type = 'I')
The complete estimated execution plan is here:
https://www.brentozar.com/pastetheplan/?id=S1htt0rxN
Actual execution plan:
https://www.brentozar.com/pastetheplan/?id=BymztxLgE
I used SQL Sentry Plan Explorer to optimize it, and it suggested that I add the following two indexes, which I did.
But that didn't improve much; it only removed the RID Lookup from the plan.
CREATE NONCLUSTERED INDEX [XI_LineTypeDocType_OE61BLIN_12172018]
ON [dbo].[OE61BLIN] ([Line_Type],[Doc_Type])
INCLUDE ([Order_Key],[Item__],[Description],[Category],[Return_to_Inventory_],[Unit_Factor],[Qty_Shipped],[Ext_Price],[ITEM_GROUP])
CREATE INDEX [XI_CustIntercompany_AR67ACST_12172018] ON [GarbageMark].[dbo].[AR61ACST]
([Cust__] ASC)
INCLUDE ([Intercompany])
I am completely stuck on how to approach this problem.
I see that the Lazy Spool is the most expensive operation, but I don't know how to remove or replace it.
Regrettably, you don't prefix intercompany in the where clause with its table name, so to some extent I'm guessing at the changes you see below. I suggest that you rearrange your query to avoid the right outer join and then, perhaps more importantly, place the intercompany <> 1 condition directly into the left join, which removes the use of ISNULL() from your where clause.
SELECT
dbo.OE61BLIN.Order_Key
, dbo.OE61BLIN.Doc_Type
, dbo.OE61BHED.Doc__
, dbo.OE61BHED.Inv_Date
, dbo.OE61BHED.Cust__
, dbo.OE61BLIN.Line_Type
, dbo.OE61BLIN.Item__
, dbo.OE61BLIN.Description
, (CASE
WHEN dbo.OE61BLIN.Doc_Type = 'I' THEN dbo.OE61BLIN.Qty_Shipped * dbo.OE61BLIN.Unit_Factor
WHEN dbo.OE61BLIN.Doc_Type = 'C' AND
dbo.OE61BLIN.return_to_inventory_ = 1 THEN -dbo.OE61BLIN.Qty_Shipped * dbo.OE61BLIN.Unit_Factor
ELSE 0
END) AS QTY
, (CASE
WHEN dbo.OE61BLIN.Doc_Type = 'I' THEN dbo.OE61BLIN.Ext_Price
WHEN dbo.OE61BLIN.Doc_Type = 'C' THEN -dbo.OE61BLIN.Ext_Price
ELSE 0
END) * (CASE
WHEN ISNULL( dbo.OE61BHED.Inv_Disc__, 0 ) <> 0 THEN 1 - (dbo.OE61BHED.Inv_Disc__ / 100)
ELSE 1
END)
AS amount
, dbo.OE61BHED.Inv_Disc__
, dbo.OE61BLIN.ITEM_GROUP
, dbo.OE61BLIN.Category
, ISNULL( dbo.AR61ACST.intercompany, 0 ) AS intercompany
FROM dbo.OE61BLIN
INNER JOIN dbo.OE61BHED ON dbo.OE61BLIN.Order_Key = dbo.OE61BHED.Order_Key
LEFT OUTER JOIN dbo.AR61ACST ON dbo.OE61BHED.Cust__ = dbo.AR61ACST.Cust__
AND dbo.AR61ACST.intercompany != 1
WHERE dbo.OE61BLIN.Line_Type = 'R'
AND dbo.OE61BLIN.Doc_Type IN ('C','I')
;
I believe the join between OE61BLIN and OE61BHED can be an inner join; if not, try a left join.
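One thing worth checking against your own data before relying on this: with a left join, a condition in the ON clause and the same condition in the WHERE clause are not interchangeable. In WHERE it removes rows from the result; in ON it only stops the right-hand table from matching, so the left-hand row is still returned with NULLs. The sketch below shows the difference using Python's sqlite3 module as a stand-in for SQL Server; the `orders` and `customers` tables are hypothetical, and IFNULL plays the role of ISNULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_key INTEGER, cust TEXT);
    CREATE TABLE customers (cust TEXT, intercompany INTEGER);
    INSERT INTO orders VALUES (1, 'A'), (2, 'B'), (3, 'C');
    -- Customer 'B' is intercompany; customer 'C' has no customers row at all.
    INSERT INTO customers VALUES ('A', 0), ('B', 1);
""")

# Condition in WHERE: the intercompany order (2) is filtered out.
where_rows = conn.execute("""
    SELECT o.order_key FROM orders o
    LEFT JOIN customers c ON o.cust = c.cust
    WHERE IFNULL(c.intercompany, 0) <> 1
    ORDER BY o.order_key
""").fetchall()

# Condition in ON: order 2 is kept, just without a matching customers row.
on_rows = conn.execute("""
    SELECT o.order_key FROM orders o
    LEFT JOIN customers c ON o.cust = c.cust AND c.intercompany <> 1
    ORDER BY o.order_key
""").fetchall()
print(where_rows, on_rows)
```

If intercompany orders must stay excluded, keep a predicate for that in the WHERE clause and compare row counts between the two versions.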

Optimizing query with multiple CASEs with Count depending on column content

I have a part of a query which I am trying to optimize. My tables have a lot of info and it would be good for us to know if we could optimize it a little bit.
This part is the one that is taking longer:
CASE
WHEN {SPS_FACTURAS}.[TipoFactura] = 'F'
AND {SPS_FACTURAS}.[IsPurged] = 0 THEN (SELECT COUNT(DISTINCT NumRec)
FROM {SPS_LINFARMA} linfarma
WHERE linfarma.[IdCodFact] = {SPS_FACTURAS}.[IdCodFact])
WHEN {SPS_FACTURAS}.[TipoFactura] = 'F'
AND {SPS_FACTURAS}.[IsPurged] = 1 THEN (SELECT COUNT(DISTINCT NumRec)
FROM {SPS_HISTLINFARMA} histfarm
WHERE histfarm.[IdCodFact] = {SPS_FACTURAS}.[IdCodFact])
WHEN {SPS_FACTURAS}.[TipoFactura] = 'C'
AND {SPS_FACTURAS}.[NomUsrIns] <> 'WS_Faturacao'
AND {SPS_FACTURAS}.[IsPurged] = 0 THEN (SELECT COUNT(DISTINCT NUMFICH)
FROM {SPS_LINFACT2} linha
WHERE linha.[IdCodFact] = {SPS_FACTURAS}.[IdCodFact])
WHEN {SPS_FACTURAS}.[TipoFactura] = 'C'
AND {SPS_FACTURAS}.[NomUsrIns] <> 'WS_Faturacao'
AND {SPS_FACTURAS}.[IsPurged] = 1 THEN (SELECT COUNT(DISTINCT NUMFICH)
FROM {SPS_HISTLINFACT} histlin
WHERE histlin.[IdCodFact] = {SPS_FACTURAS}.[IdCodFact])
WHEN {SPS_FACTURAS}.[TipoFactura] = 'C'
AND {SPS_FACTURAS}.[NomUsrIns] = 'WS_Faturacao'
THEN (SELECT COUNT(DISTINCT NUMFICH)
FROM {SPS_LINFACTWS} linWS
WHERE linWS.[IdCodFact] = {SPS_FACTURAS}.[IdCodFact])
END
Is there a way I can optimize it?
Thanks a lot in advance,
Vincent Colpa

How to tune this query in Oracle 11g

I have been asked to tune the below query and would like to know if there is any better way to tune it?
SELECT req_dtl.lab_ord_occ_test_id ,
req_dtl.order_ref_no ,
req_dtl.accession_no ,
req_dtl.test_code ,
req_dtl.test_name ,
req_dtl.test_id ,
req_dtl.schedule_id ,
req_dtl.lab_ord_occ_id ,
req_dtl.order_type ,
lab_occ.facility_id ,
lab_occ.patient_id ,
lab_occ.order_draw_dt ,
hdr.source_system ,
(SELECT CORPORATION_ACRONYM
FROM corporation c,
facility f
WHERE c.corporation_id = f.corporation_id
AND f.facility_id = lab_occ.facility_id) AS corporation_acronym,
tst.container ,
lab_occ.order_duration_type ,
occ_test.mnc_yn
FROM ORDER_REQUISITION_HEADER hdr ,
ORDER_REQUISITION_DETAIL req_dtl ,
LAB_ORDER_OCC_TEST occ_test ,
LAB_ORDER_OCC lab_occ ,
TEST tst
WHERE hdr.requisition_hdr_id = in_requisition_hdr_id
AND hdr.msg_sent_to_lab_yn = 'Y'
AND req_dtl.requisition_hdr_id = hdr.requisition_hdr_id
AND occ_test.lab_order_occ_test_id = req_dtl.lab_ord_occ_test_id
AND req_dtl.test_id = tst.test_id
AND tst.accession_type NOT LIKE 'CMP%'
AND occ_test.status != 'R'
AND occ_test.lab_order_occ_id = lab_occ.lab_order_occ_id
AND lab_occ.status = 'A'
AND occ_test.created_dt >= hdr.msg_sent_to_lab_dt
AND NVL(occ_test.test_sent_to_lab_yn,'N') = 'N'
AND NOT EXISTS
(SELECT orddata.*
FROM MISSING_ORDER_DATA orddata,
TEST_CONFIG_HOLD_AOE tcha
WHERE orddata.test_id = tcha.test_id
AND tcha.active_yn = 'Y'
AND orddata.status_flag = 'A'
AND orddata.answer IS NULL
AND orddata.msg_sent_to_lab_yn = 'N'
AND orddata.lab_order_occ_test_id=occ_test.lab_order_occ_test_id
)
ORDER BY req_dtl.accession_no;
In the execution plan, no tables are accessed with a full table scan; there are just a lot of nested loops.
Please suggest a better way to tune this query.
AND NOT EXISTS
(SELECT orddata.*
FROM MISSING_ORDER_DATA orddata,
TEST_CONFIG_HOLD_AOE tcha
WHERE orddata.test_id = tcha.test_id
AND tcha.active_yn = 'Y'
AND orddata.status_flag = 'A'
AND orddata.answer IS NULL
AND orddata.msg_sent_to_lab_yn = 'N'
AND orddata.lab_order_occ_test_id=occ_test.lab_order_occ_test_id
)
could be moved to the FROM clause as a left join:
FROM
...
LEFT JOIN (SELECT DISTINCT orddata.lab_order_occ_test_id
FROM MISSING_ORDER_DATA orddata,
TEST_CONFIG_HOLD_AOE tcha
WHERE orddata.test_id = tcha.test_id
AND tcha.active_yn = 'Y'
AND orddata.status_flag = 'A'
AND orddata.answer IS NULL
AND orddata.msg_sent_to_lab_yn = 'N'
) missing ON missing.lab_order_occ_test_id = occ_test.lab_order_occ_test_id
WHERE missing.lab_order_occ_test_id IS NULL
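For reference, the two forms return the same rows. Here is a minimal sketch with Python's sqlite3 module standing in for Oracle; the two cut-down tables and their rows are hypothetical. Whether the join form is actually faster depends on the optimizer, so it is worth comparing both plans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE occ_test (id INTEGER);
    CREATE TABLE missing_data (occ_test_id INTEGER, status_flag TEXT);
    INSERT INTO occ_test VALUES (1), (2), (3);
    -- Test 2 has two active missing answers; test 3 has only an inactive one.
    INSERT INTO missing_data VALUES (2, 'A'), (2, 'A'), (3, 'X');
""")

not_exists_rows = conn.execute("""
    SELECT t.id FROM occ_test t
    WHERE NOT EXISTS (SELECT 1 FROM missing_data m
                      WHERE m.status_flag = 'A' AND m.occ_test_id = t.id)
    ORDER BY t.id
""").fetchall()

# Anti-join: keep only the rows for which the left join found no match.
anti_join_rows = conn.execute("""
    SELECT t.id FROM occ_test t
    LEFT JOIN (SELECT DISTINCT occ_test_id FROM missing_data
               WHERE status_flag = 'A') m ON m.occ_test_id = t.id
    WHERE m.occ_test_id IS NULL
    ORDER BY t.id
""").fetchall()
print(not_exists_rows, anti_join_rows)
```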
Also, you should move the corporation acronym subquery into the FROM clause:
FROM
...
INNER JOIN (SELECT CORPORATION_ACRONYM, f.facility_id
FROM corporation c,
facility f
WHERE c.corporation_id = f.corporation_id) acr ON
acr.facility_id = lab_occ.facility_id
...
Additionally, the TEST object must have an index on accession_type; otherwise the tst.accession_type NOT LIKE 'CMP%' clause will be slower than necessary.
Also, the clause: NVL(occ_test.test_sent_to_lab_yn,'N') = 'N' is essentially an outer join on partially-validated data. Does the test_sent_to_lab_yn column in occ_test contain nulls? If not, consider using an IN clause along with a valid list. [it looks like a yes/no column, maybe this should be equality on 'Y' and get someone to clean up the nulls?]
Please post the explain plan so we can suggest a reordering of the predicates to minimize the rows returned by the first clause, and make HINT suggestions.

splitting select query into two or more parts

I am using TOAD for Oracle. While writing some SQL queries I ran into this problem:
My select query uses a few tables that each have approximately 10M rows. Two of the tables have over 70M rows.
Let's say I have:
a TRANSACTION table (primary key: SQ_TRANSACTION_ID)
a TRANSACTION_DETAIL table (foreign keys: RF_TRANSACTION_ID, RF_PRODUCT_ID)
a PRODUCT table (primary key: SQ_PRODUCT_ID)
My select query looks like this:
SELECT TR.TRANSACTION_ID,
SUM(CASE WHEN PR.CD_PRODCUT_TYPE = 'A'
THEN TRD.CS_INVOICE_PRICE ELSE 0 END) A_PRODUCT_TOTAL,
SUM(CASE WHEN PR.CD_PRODCUT_TYPE <> 'A'
THEN TRD.CS_INVOICE_PRICE ELSE 0 END) B_PRODUCT_TOTAL
FROM TRANSACTION TR,
TRANSACTION_DETAIL TRD,
PRODUCT PR
WHERE TR.SQ_TRANSACTION_ID = TRD.RF_TRANSACTION_ID
AND TRD.RF_PRODUCT_ID = PR.SQ_PRODUCT_ID
GROUP BY TR.TRANSACTION_ID,
CASE WHEN PR.CD_PRODCUT_TYPE = 'A' THEN TRD.CS_INVOICE_PRICE ELSE 0 END,
CASE WHEN PR.CD_PRODCUT_TYPE <> 'A' THEN TRD.CS_INVOICE_PRICE ELSE 0 END
Is there a way to split this query into two or more parts that reference each other through their foreign/primary keys? I mean splitting it into two parts, where the first part fetches A_PRODUCT_TOTAL and the second part fetches B_PRODUCT_TOTAL. The transaction IDs of the two parts should match in the result data.
A direct translation of your query would be:
SELECT TR.TRANSACTION_ID, SUM(TRD.CS_INVOICE_PRICE) A_PRODUCT_TOTAL
FROM TRANSACTION TR join
TRANSACTION_DETAIL TRD
on TR.SQ_TRANSACTION_ID = TRD.RF_TRANSACTION_ID join
PRODUCT PR
on TRD.RF_PRODUCT_ID = PR.SQ_PRODUCT_ID
WHERE PR.CD_PRODCUT_TYPE = 'A'
GROUP BY TR.TRANSACTION_ID,
CASE WHEN PR.CD_PRODCUT_TYPE = 'A' THEN TRD.CS_INVOICE_PRICE ELSE 0 END
However, I suspect that you don't want the second expression in the GROUP BY, because each transaction would be split into rows where the invoice price is the same:
SELECT TR.TRANSACTION_ID, SUM(TRD.CS_INVOICE_PRICE) A_PRODUCT_TOTAL
FROM TRANSACTION TR join
TRANSACTION_DETAIL TRD
on TR.SQ_TRANSACTION_ID = TRD.RF_TRANSACTION_ID join
PRODUCT PR
on TRD.RF_PRODUCT_ID = PR.SQ_PRODUCT_ID
WHERE PR.CD_PRODCUT_TYPE = 'A'
GROUP BY TR.TRANSACTION_ID;
The query for 'B' would be similar.
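To put the two halves back together, a sketch along these lines may help. It uses Python's sqlite3 module as a stand-in for Oracle, with hypothetical, shortened table and column names (`trx`, `trx_detail`, `product`). It compares a single-pass conditional aggregation grouped by transaction only with the split 'A' and 'B' queries joined back on the transaction id. The split version needs outer joins and COALESCE, because a transaction with no 'A' rows is missing from the 'A' half.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trx (id INTEGER PRIMARY KEY);
    CREATE TABLE product (id INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE trx_detail (trx_id INTEGER, product_id INTEGER, price REAL);
    INSERT INTO trx VALUES (1), (2);
    INSERT INTO product VALUES (10, 'A'), (20, 'B');
    INSERT INTO trx_detail VALUES (1, 10, 5.0), (1, 10, 5.0), (1, 20, 3.0), (2, 20, 7.0);
""")

# Single pass: both totals at once, grouped by transaction id only.
single_pass = conn.execute("""
    SELECT d.trx_id,
           SUM(CASE WHEN p.type = 'A' THEN d.price ELSE 0 END) AS a_total,
           SUM(CASE WHEN p.type <> 'A' THEN d.price ELSE 0 END) AS b_total
    FROM trx_detail d JOIN product p ON p.id = d.product_id
    GROUP BY d.trx_id ORDER BY d.trx_id
""").fetchall()

# Split: one aggregate per product type, matched up again by transaction id.
split = conn.execute("""
    SELECT t.id, COALESCE(a.total, 0), COALESCE(b.total, 0)
    FROM trx t
    LEFT JOIN (SELECT d.trx_id, SUM(d.price) AS total
               FROM trx_detail d JOIN product p ON p.id = d.product_id
               WHERE p.type = 'A' GROUP BY d.trx_id) a ON a.trx_id = t.id
    LEFT JOIN (SELECT d.trx_id, SUM(d.price) AS total
               FROM trx_detail d JOIN product p ON p.id = d.product_id
               WHERE p.type <> 'A' GROUP BY d.trx_id) b ON b.trx_id = t.id
    ORDER BY t.id
""").fetchall()
print(single_pass, split)
```

Note that the split version reads the detail table twice, so on tables of this size the single-pass form is usually worth trying first.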