Help for SQL tuning - ORACLE - sql

I have a query that pulls data from 5 huge tables. Could you please help me with performance tuning of this query:
SELECT DECODE(SIGN((t1.amount - NVL(t2.amount, 0)) - 4.999), 1, NVL(t2.amount, 0), t1.amount) AS amount_1,
t1.element_id,
t1.start_date ,
t1.amount,
NVL(t5.abrev, NULL) AS criteria,
t1.case_id ,
NVL(t5.value, NULL) segment,
add_months(t1.start_date, -1) invoice_date,
NVL((SELECT SUM(b.amount)
FROM TABLE1 a, TABLE3 b
WHERE a.element_id = b.element_id
AND b.date_invoicing < a.start_date
AND t1.element_id = a.element_id),
0) amount_2
FROM TABLE1 t1, TABLE2 t2, TABLE3 t3, TABLE4 t4, TABLE5 t5
WHERE t1.TYPE = 'INVOICE'
AND t2.case_id = t3.case_id
AND t2.invoicing_id = t3.invoicing_id
AND t2.date_unpaid IS NULL
AND t1.element_id = t3.element_id(+)
AND add_months(t1.start_date, -1) <
NVL(t4.DT_FIN_DT(+), SYSDATE)
AND add_months(t1.start_date, -1) >= t4.date_creation(+)
AND t1.case_id = t4.case_id(+)
AND t4.segment = t5.abrev(+)
AND t5.Type(+) = 'CRITERIA_TYPE';
Is there anything wrong here that could be replaced with something else?
Thanks for your help

The first thing you must do is to use Explicit Joins. This will separate your joins from your filters and will help you tune this better.
Please check if these joins are correct.
SELECT
DECODE(SIGN((t1.amount - NVL(t2.amount, 0)) - 4.999), 1, NVL(t2.amount, 0), t1.amount) AS amount_1,
t1.element_id,
t1.start_date ,
t1.amount,
NVL(t5.abrev, NULL) AS criteria,
t1.case_id ,
NVL(t5.value, NULL) segment,
add_months(t1.start_date, -1) invoice_date,
NVL
(
(SELECT SUM(b.amount)
FROM TABLE1 a, TABLE3 b
WHERE a.element_id = b.element_id
AND b.date_invoicing < a.start_date
AND t1.element_id = a.element_id),
0) amount_2
FROM
TABLE1 t1
LEFT OUTER JOIN TABLE3 t3
on t1.element_id = t3.element_id
INNER JOIN TABLE2 t2
on t2.invoicing_id = t3.invoicing_id
and t2.case_id = t3.case_id
LEFT OUTER JOIN TABLE4 t4
on t1.case_id = t4.case_id
and add_months(t1.start_date, -1) < NVL(t4.DT_FIN_DT, SYSDATE)
and add_months(t1.start_date, -1) >= t4.date_creation
LEFT OUTER JOIN TABLE5 t5
on t4.segment = t5.abrev
and t5.Type = 'CRITERIA_TYPE'
WHERE t1.TYPE = 'INVOICE'
AND t2.date_unpaid IS NULL;
If they are, then you can do several things, but the best thing is to look at the execution plan.

As others have noted, it's hard to tell without looking at the execution plan.
But... some things I'd be concerned with:
The outer join to TABLE3 in the main query isn't complete, as @TonyAndrews mentioned in his comment above. See the "Incomplete Join Trail" example on Common errors seen when using OUTER-JOIN. This means your query is probably producing the wrong results, but without knowing the full intent of the query and the schema, no one but you could know this for sure.
Updating your query to use the ANSI-style INNER/[LEFT|RIGHT] OUTER syntax from the Oracle-style TableName.ColumnName(+) will help make this more apparent.
The scalar subquery will get run for every row and may be slow (assuming TABLE3 is large). It will be extremely slow if there's not a useful index on TABLE3.element_id and TABLE3.date_invoicing:
NVL((SELECT SUM(b.amount)
FROM TABLE1 a, TABLE3 b
WHERE a.element_id = b.element_id
AND b.date_invoicing < a.start_date
AND t1.element_id = a.element_id),
0) amount_2
As such, I'm not seeing a need to include TABLE1 again in this subquery. It may be better to refactor this into:
NVL((SELECT SUM(b.amount)
FROM TABLE3 b
WHERE t1.element_id = b.element_id
AND b.date_invoicing < t1.start_date),
0) amount_2
Or, you may even be better off refactoring this to use an analytical function (SO question, Oracle documentation) if the criteria for summing the b.amount values is the same as that for including them in the query in the first place:
SUM(b.amount) OVER (PARTITION BY b.element_id) amount_2
Obviously, you currently have different criteria for summing b.amount since you're joining to TABLE3 differently in the main query and the subquery, but I'd imagine that's more a factor of the "Incomplete Join Trail" than by purposeful design (a guess on my part, as I can't tell the intent of the query from the code itself).
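To check that dropping TABLE1 from the scalar subquery does not change the result, here is a minimal sketch using Python's built-in sqlite3 module rather than Oracle. The tiny schema and sample rows are made up for illustration, COALESCE stands in for NVL, and the equivalence assumes element_id is unique in TABLE1 (if it is not, the original subquery sums over every matching TABLE1 row and the two forms can differ).

```python
import sqlite3

# Hypothetical miniature of TABLE1/TABLE3; only the columns the subquery touches.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (element_id INTEGER PRIMARY KEY, start_date TEXT);
    CREATE TABLE table3 (element_id INTEGER, date_invoicing TEXT, amount REAL);
    INSERT INTO table1 VALUES (1, '2020-03-01'), (2, '2020-03-01'), (3, '2020-03-01');
    INSERT INTO table3 VALUES
        (1, '2020-01-15', 10.0),
        (1, '2020-02-15', 5.0),
        (1, '2020-04-15', 99.0),  -- invoiced after start_date, so excluded
        (2, '2020-05-01', 7.0);   -- invoiced after start_date, so excluded
""")

# Original form: TABLE1 is joined again inside the subquery.
original = """
    SELECT t1.element_id,
           COALESCE((SELECT SUM(b.amount)
                     FROM table1 a, table3 b
                     WHERE a.element_id = b.element_id
                       AND b.date_invoicing < a.start_date
                       AND t1.element_id = a.element_id), 0) AS amount_2
    FROM table1 t1
    ORDER BY t1.element_id
"""

# Simplified form: correlate TABLE3 directly with the outer t1 row.
simplified = """
    SELECT t1.element_id,
           COALESCE((SELECT SUM(b.amount)
                     FROM table3 b
                     WHERE t1.element_id = b.element_id
                       AND b.date_invoicing < t1.start_date), 0) AS amount_2
    FROM table1 t1
    ORDER BY t1.element_id
"""

rows_original = conn.execute(original).fetchall()
rows_simplified = conn.execute(simplified).fetchall()
print(rows_original == rows_simplified)
```

Running the same comparison against a copy of the real data would be a reasonable sanity check before changing the production query.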

The optimizer may have produced a suboptimal execution plan. Or it may very well be running as fast as possible given the amount of work the database actually needs to do.
Without the execution plan and knowledge of the keys, relations, and indexes, it is hard to tell what is going on.
Scalar subqueries in the select list are usually not a good idea when the outer query returns a large number of rows.
The following expressions may prevent the optimizer from using the statistics because of the function calls. Indexes would probably not be used either for the same reason.
AND add_months(t1.start_date, -1) < NVL(t4.DT_FIN_DT(+), SYSDATE)
AND add_months(t1.start_date, -1) >= t4.date_creation(+)
Can't really be more specific than that :)

You need to learn about how to view and understand execution plans. This previous question is a good place to start.

It's a bit odd to nest one SELECT statement inside another here:
NVL((SELECT SUM(b.amount)
FROM TABLE1 a, TABLE3 b
WHERE a.element_id = b.element_id
AND b.date_invoicing < a.start_date
AND t1.element_id = a.element_id),
0) amount_2
Rewrite it as an inline view and join to it in the FROM clause.

Related

Oracle join optimization

I have the following SQL query:
select dres.colA,
dres.colP,
dres.ID,
dre.ID,
dre.colED,
dre.VID,
vpp.VID,
vpp.colDESC
from table1 dres
left join table2 dre on dres.ID = dre.ID
left join table3 vpp on vpp.VID = dre.VID
where dre.START_TIME >= date '2017-01-01';
Do you have any suggestion how the query can work better (or should look)?
Something like:
...where dres.ID in (select * from table2
where VID in (select * from table3))....
First, your where clause changes the outer joins to inner joins. So, start by writing the query as:
select dres.colA, dres.colP, dres.ID,
dre.ID, dre.colED, dre.VID,
vpp.VID, vpp.colDESC
from table1 dres join
table2 dre
on dres.ID = dre.ID join
table3 vpp
on vpp.VID = dre.VID
where dre.START_TIME >= date '2017-01-01';
The place to start is with indexes on table2(id, vid, start_time, colED) and table3(vid, colDESC).
It is possible that an alternative indexing strategy would work: table2(start_time, id, vid, colED). This would allow the where clause to use the index. But that particular where clause may not be highly selective.
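The point about the where clause turning the outer join into an inner join can be seen on a toy dataset. This is a sketch using Python's sqlite3 module, not Oracle; the three-row tables are invented, and dates are stored as ISO strings. It also shows that moving the filter into the ON clause is what would keep the unmatched rows, if that were the intent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, colA TEXT);
    CREATE TABLE table2 (id INTEGER, vid INTEGER, start_time TEXT);
    INSERT INTO table1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO table2 VALUES (1, 10, '2017-06-01'), (2, 20, '2016-06-01');
""")

# LEFT JOIN, but the WHERE clause rejects the NULL-extended rows.
left_join_rows = conn.execute("""
    SELECT dres.id, dre.start_time
    FROM table1 dres
    LEFT JOIN table2 dre ON dres.id = dre.id
    WHERE dre.start_time >= '2017-01-01'
    ORDER BY dres.id
""").fetchall()

# Plain inner join with the same filter.
inner_join_rows = conn.execute("""
    SELECT dres.id, dre.start_time
    FROM table1 dres
    JOIN table2 dre ON dres.id = dre.id
    WHERE dre.start_time >= '2017-01-01'
    ORDER BY dres.id
""").fetchall()

# Filter moved into ON: every table1 row survives, unmatched ones get NULL.
preserved_rows = conn.execute("""
    SELECT dres.id, dre.start_time
    FROM table1 dres
    LEFT JOIN table2 dre
      ON dres.id = dre.id AND dre.start_time >= '2017-01-01'
    ORDER BY dres.id
""").fetchall()

print(left_join_rows == inner_join_rows, len(preserved_rows))
```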

SQL Simple Query From Multiple Tables

I would appreciate if anyone can give me a hand with this question
I have 4 SQL tables. Open, High, Low and Close.
Each have 2 columns called [Date],[Price].
The dates are the same - but Price is a number and is different.
How can we make a query where the results are as follows
[Date],[Open.Price],[High.Price],[Low.Price],[Close.Price]
SELECT Open_table.date,Open_table.Price,High_table.Price,low_table.Price,
Close_table.Price
FROM Open_table
JOIN High_table ON Open_table.date = High_table.date
JOIN low_table ON Open_table.date = low_table.date
JOIN Close_table ON Open_table.date = Close_table.date
You could try joining on date:
SELECT t1.[Date],
t1.[Price] AS [Open.Price],
t2.[Price] AS [High.Price],
t3.[Price] AS [Low.Price],
t4.[Price] AS [Close.Price]
FROM [Open] t1
INNER JOIN High t2
ON t1.[Date] = t2.[Date]
INNER JOIN Low t3
ON t2.[Date] = t3.[Date]
INNER JOIN [Close] t4
ON t3.[Date] = t4.[Date]
I found another way of doing so after I posted this question
SELECT
EuropeOpen.[Date],EuropeOpen.[OCDO LN],EuropeHigh.[Date],EuropeHigh.[OCDO LN],EuropeLow.[Date],EuropeLow.[OCDO LN],
EuropeClose.[Date],EuropeClose.[OCDO LN]
FROM EuropeOpen,EuropeHigh,EuropeLow,EuropeClose
-- Conditions matching the [Date] columns belong here; without them this is a cross join
WHERE....
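One caveat about the comma-separated FROM list above: without conditions tying the dates together, it returns every combination of rows from the four tables. A small sketch with Python's sqlite3 module illustrates the difference in row counts; the table and column names (open_t, trade_date, and so on) are placeholders chosen to avoid reserved words, not the names from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
sample = [("2020-01-01", 1.0), ("2020-01-02", 2.0), ("2020-01-03", 3.0)]

# Four hypothetical price tables with the same three dates each.
for name in ("open_t", "high_t", "low_t", "close_t"):
    conn.execute(f"CREATE TABLE {name} (trade_date TEXT, price REAL)")
    conn.executemany(f"INSERT INTO {name} VALUES (?, ?)", sample)

# Comma join with no WHERE clause: a Cartesian product (3 * 3 * 3 * 3 rows).
cross_count = conn.execute(
    "SELECT COUNT(*) FROM open_t, high_t, low_t, close_t"
).fetchone()[0]

# Explicit joins on the date: one row per date.
joined_count = conn.execute("""
    SELECT COUNT(*)
    FROM open_t o
    JOIN high_t h ON o.trade_date = h.trade_date
    JOIN low_t l ON o.trade_date = l.trade_date
    JOIN close_t c ON o.trade_date = c.trade_date
""").fetchone()[0]

print(cross_count, joined_count)
```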

Improving performance: Very Slow Oracle SQL Join

I am a newbie in SQL querying and I am spending 3hrs to get the whole result of joining 2 queries.
I have focused on using left joins and, after some research, avoided subqueries in the select statement. However, it is still extremely slow. I have no close friends who know SQL well enough to explain what's wrong or what approach I should take.
I am also new here so if this question is not allowed please inform me and I will remove it immediately.
This is the structure of the query...
The first query will get the member details.
The second query will get the transaction details.
The relationship is,
one product has many sub-plans which has many members.
One product also has many transactions which is made on a per product basis.
I am required to show all transactions and duplicate each line for each member.
I joined the queries using the product primary key.
Prior to joining, I have tested both individual queries and they turned out fine. Only 1-2 secs and I get the result.
But joining the two, I end up with 3 hrs of waiting.
SELECT
MPPFF.N_DX,
MPPFF.PM_A_P,
MPPFF.FEE1,
MPPFF.FEE2,
MPPFF.FEE3,
MPPFF.FEE4,
MPPFF.FEE11,
MPPFF.FEE12,
MPPFF.FEE5,
MPPFF.N_NO,
MPPFF.SETN_DX,
MPPFF.PRIME_NO,
MPPFF.SECN_NO,
MPPFF.COMM_A,
MPPFF.TYX_NO,
MPPFF.P_NAME,
MPPFF.B_BFX,
MPPFF.B_FM,
MPPFF.B_TO,
MPPFF.BB_NAME_P,
MPPFF.BB_NAME_S,
MPPFF.REVERSE_BFX,
MPPFF.TYX_REF_NO,
MPPFF.BB_NO_AX,
MPPFF.BB_NAME_AX,
MPPFF.DXC,
MPPFF.ST,
MPPFF.DAY,
MPPFF.CE_D_PRODUCT,
MPPFF.CE_H,
MPPFF.AS_C_E,
MPPFF.BCH,
MPPFF.RCPY_NO,
MPPFF.RE_BFX,
MPPFF.A_END,
MPPFF.PLACE,
MPPFF.MEMB_DX,
MPPFF.MBR_NO,
MPPFF.MBR_TR_BFX,
MPPFF.CE_D_TERM_CE,
MPPFF.MEMBER_AS,
MPPFF.C_USER,
MPPFF.C_BFX,
MPPFF.U_USER,
MPPFF.U_BFX
FROM (
SELECT
FF.N_DX,
FF.PM_A_P,
FF.FEE1,
FF.FEE2,
FF.FEE3,
FF.FEE4,
FF.FEE11,
FF.FEE12,
FF.FEE5,
FF.N_NO,
FF.SETN_DX,
FF.PRIME_NO,
FF.SECN_NO,
FF.COMM_A,
FF.TYX_NO,
FF.P_NAME,
FF.B_BFX,
FF.B_FM,
FF.B_TO,
FF.BB_NAME_P,
FF.BB_NAME_S,
FF.REVERSE_BFX,
FF.TYX_REF_NO,
FF.BB_NO_AX,
FF.BB_NAME_AX,
FF.DXC,
FF.ST,
FF.DAY,
FF.CE_D_PRODUCT,
FF.CE_H,
FF.AS_C_E,
FF.RCPY_NO,
FF.RE_BFX,
FF.A_END,
FF.BCH,
MPP.MBR_NO,
MPP.MBR_TR_BFX,
MPP.CE_D_TERM_CE,
MPP.C_USER,
MPP.C_BFX,
MPP.U_USER,
MPP.U_BFX,
MPP.PLACE,
MPP.MEMBER_AS,
MPP.TYX_DX,
MPP.AS_DX,
MPP.PRODUCT,
MPP.POPL_DX,
MPP.MEMB_DX,
FF.TYX_DX
FROM (
SELECT
MBR.MEMB_DX,
MBR.MBR_NO,
MBR.MBR_TR_BFX,
MBR.CE_D_TERM_CE,
MBR.C_USER,
MBR.C_BFX,
MBR.U_USER,
MBR.U_BFX,
MPP.PLACE,
MPP.MEMBER_AS,
MPP.TYX_DX,
MPP.AS_DX,
MPP.PRODUCT,
MPP.POPL_DX
FROM (
SELECT
MPP.PLACE,
MPP.MEMBER_AS,
MPP.TYX_DX,
MPP.AS_DX,
MPP.PRODUCT,
MPP.POPL_DX,
MMP.MEMB_DX
FROM(
SELECT
MPP.PLACE,
MPP.TYX_AS_DXC MEMBER_AS,
MPP.TYX_DX,
MPP.AS_DX,
MPP.POPL_DX,
RPT.PRODUCT
FROM
TABLE1 MPP
LEFT JOIN (
SELECT
SUBSTR(CE_D_PRODUCT,9) PRODUCT,
AS_DX
FROM
TABLE6 RPT,
TABLE7 PP
WHERE
PP.PRTY_DX = RPT.PRTY_DX
) RPT
ON MPP.AS_DX = RPT.AS_DX
) MPP
LEFT JOIN (
SELECT
POPL_DX,
MEMB_DX
FROM
TABLE4
)MMP
ON MPP.POPL_DX=MMP.POPL_DX
) MPP,
(
SELECT
MBR.MEMB_DX,
MBR.MBR_NO,
MBR.TERM_BFX MBR_TR_BFX,
MBR.CE_D_TERM_CE,
MBR.C_USER,
MBR.C_BFX,
MBR.U_USER,
MBR.U_BFX
FROM
TABLE8 MBR
) MBR
WHERE
MPP.MEMB_DX = MBR.MEMB_DX
) MPP
INNER JOIN
(
SELECT
FF.N_DX,
ROUND(CB.FEE5 * FF.RATE,2) PM_A_P,
CB.FEE1,
CB.FEE2,
CB.FEE3,
CB.FEE4,
CB.FEE11,
CB.FEE12,
CB.FEE5,
FF.N_NO,
FF.SETN_DX,
FF.PRIME_NO,
FF.SECN_NO,
FF.COMM_A,
FF.TYX_NO,
FF.P_NAME_1||', '||FF.P_NAME_2||' '||FF.P_NAME_3 P_NAME,
FF.B_BFX,
FF.B_FM,
FF.B_TO,
FF.BB_NAME_1_P||', '||FF.BB_NAME_2_P BB_NAME_P,
FF.BB_NAME_1_S||', '||FF.BB_NAME_2_S BB_NAME_S,
CB.REVERSE_BFX,
FF.TYX_REF_NO,
FF.BB_NO_AX,
FF.BB_NAME_1_AX||' '|| FF.BB_NAME_2_AX BB_NAME_AX,
CASE
WHEN FF.CE_D_ST IN ('A', 'B', 'C') THEN 'AC'
WHEN FF.DAY >1 THEN 'NEW'
ELSE 'AB'
END DXC,
FF.CE_D_ST ST,
FF.DAY,
FF.CE_D_PRODUCT,
FF.CE_D_COMP CE_H,
FF.AS_C AS_C_E,
FF.RCPY_NO,
FF.RE_BFX,
ROUND(CB.A_S,2) A_END,
FF.TYX_DX,
MP.BCH
FROM
TABLE2 CB,
TABLE3 FF
LEFT JOIN (
SELECT
SUBSTR(CE_D_BCH_O,13) BCH,
TYX_DX
FROM
TABLE5 MP
)MP
ON MP.TYX_DX = FF.TYX_DX
WHERE
FF.SETN_DX = CB.SETN_DX AND
EXTRACT( YEAR FROM FF.EFF_BFX) >=2013
) FF
ON MPP.TYX_DX = FF.TYX_DX
)MPPFF
;
Use ROWNUM to prevent optimizer transformations from degrading the performance.
You are encountering a common problem - two queries run fast separately but run slow when put together. Oracle does not have to run the queries in the order they are written. It can merge views, push predicates around, and generally completely re-write the query to run in a different order. Normally this is a great thing because you don't want to have to worry about which physical order to join tables. But sometimes Oracle applies the wrong transformations and the results are disastrous.
There are two ways to solve these problems:
1. Look at table structures, the statements, the execution plans, SQL monitoring or traces, statistics, etc. Try to find out which operation is slow, and why (use cardinality as your guide), and then try to fix it. This process can easily take hours, maybe even days, but it's the best way to learn.
2. Stop the optimizer from combining the queries with a simple trick. There are a few ways to do this but in my experience the simplest way is to add the pseudo-column ROWNUM to any inline view that you do not want transformed. ROWNUM is a special column that tells Oracle "this query block must be returned in a specific way, don't do anything to it".
Change this:
--This is slow:
select ...
from
(
--This is fast:
select ...
) inline_view1
join
(
--This is fast:
select ...
) inline_view2
on ...
to this:
--Now this is fast.
select ...
from
(
--This is fast:
select rownum /*add rownum to prevent slow transformations*/, ...
) inline_view1
join
(
--This is fast:
select rownum /*add rownum to prevent slow transformations*/, ...
) inline_view2
on ...
In your code I believe the two inline views to modify would be the outer-most MPP and FF.
On a side note, I disagree with some of the other comments and answers:
A CTE will not help here since none of the tables are used twice.
You don't always need to know a million details about the query to tune it. Unless you have the time and want to improve your skills.
I think your over-all query structure is good. You are on the right path to building great SQL statements. Inline views are the key to writing SQL - build small units of code, combine them in simple steps, repeat. Putting all the tables together in one massive join is a recipe for spaghetti code. Although I agree with others that you should avoid the old-fashioned join syntax. And the query would really benefit from some comments and more meaningful names. And don't feel afraid to put all the select list items on one line. Having a 500-column line isn't ideal, but you want to focus on the joins, not the simple list of columns.
Your query is almost unreadable, because of all the nesting. And you are mixing pre 1992 style joins with current join syntax. Don't use the outdated comma-separated join syntax. It is prone to errors. All your outer-joins are void, because at some point you will always have criteria that dismisses outer-joined records, such as when inner-joining table8 on the outer-joined table4's memb_dx.
Your query seems to translate to
select
<several fields from the tables>
from table1 mpp
join table6 rpt on rpt.as_dx = mpp.as_dx
join table7 pp on pp.prty_dx = rpt.prty_dx
join table4 mmp on mmp.popl_dx = mpp.popl_dx
join table8 mbr on mbr.memb_dx = mmp.memb_dx
join table3 ff on ff.tyx_dx = mpp.tyx_dx and extract(year from ff.eff_bfx) >= 2013
join table2 cb on ff.setn_dx = cb.setn_dx
left join table5 mp on mp.tyx_dx = ff.tyx_dx;
and maybe you want it to be
select
<several fields from the tables>
from table1 mpp
left join table6 rpt on rpt.as_dx = mpp.as_dx
left join table7 pp on pp.prty_dx = rpt.prty_dx
left join table4 mmp on mmp.popl_dx = mpp.popl_dx
left join table8 mbr on mbr.memb_dx = mmp.memb_dx
join table3 ff on ff.tyx_dx = mpp.tyx_dx and extract(year from ff.eff_bfx) >= 2013
join table2 cb on ff.setn_dx = cb.setn_dx
left join table5 mp on mp.tyx_dx = ff.tyx_dx;
instead, or something along those lines. Get rid of all the nesting and stay with a clear, easy-to-read FROM clause.
One thing others haven't mentioned is the use of
EXTRACT( YEAR FROM FF.EFF_BFX) >=2013
This applies the EXTRACT function to every row selected from TABLE3 (I believe that's what FF refers to at this point in the query). I suggest replacing the above with
FF.EFF_BFX >= TO_DATE('01-JAN-2013', 'DD-MON-YYYY')
or something similar. This requires only a single call to TO_DATE to generate the date constant, which is then compared directly to FF.EFF_BFX, which appears to be a column of type DATE.
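As a rough illustration of why the plain range comparison is preferable, here is a sketch using Python's sqlite3 module. This is SQLite, not Oracle: strftime stands in for EXTRACT, the dates are ISO strings, the index name idx_eff is made up, and Oracle's optimizer makes its own decisions. The general pattern it shows is that wrapping the column in a function hides it from an ordinary index on that column, while the two predicates still select the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table3 (tyx_dx INTEGER, eff_bfx TEXT)")
conn.execute("CREATE INDEX idx_eff ON table3 (eff_bfx)")  # hypothetical index
conn.executemany(
    "INSERT INTO table3 VALUES (?, ?)",
    [(i, f"{2010 + i % 8}-06-15") for i in range(80)],  # years 2010..2017
)

# Function applied to the column, analogous to EXTRACT(YEAR FROM ...).
by_function = (
    "SELECT COUNT(*) FROM table3 "
    "WHERE CAST(strftime('%Y', eff_bfx) AS INTEGER) >= 2013"
)
# Bare column compared with a constant.
by_range = "SELECT COUNT(*) FROM table3 WHERE eff_bfx >= '2013-01-01'"


def plan(sql):
    """Return the 'detail' text of SQLite's EXPLAIN QUERY PLAN output."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[3] for row in rows)


count_function = conn.execute(by_function).fetchone()[0]
count_range = conn.execute(by_range).fetchone()[0]
plan_function = plan(by_function)
plan_range = plan(by_range)

print(count_function, count_range)
print(plan_function)
print(plan_range)
```

On Oracle, the equivalent check would be to compare the execution plans of the two predicates.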
This query also uses the same table alias (e.g. FF, MPP, etc) multiple times for different entities in different contexts. In my opinion this is bad practice, and I suggest you rework your query to use a unique alias for each entity, which will make the query easier to understand.
As others have mentioned, getting rid of the pre-1992 joins in the WHERE clause would also help clarify what's going on, as would getting rid of the long column lists. A couple of the subqueries could be eliminated as well which would make the query cleaner and clearer.
After dealing with all the above I get the following:
SELECT *
FROM (SELECT *
FROM TABLE1 MPP
LEFT OUTER JOIN (SELECT SUBSTR(CE_D_PRODUCT, 9) PRODUCT,
AS_DX
FROM TABLE6 RPT
INNER JOIN TABLE7 PP
ON PP.PRTY_DX = RPT.PRTY_DX) RPT
ON MPP.AS_DX = RPT.AS_DX
LEFT OUTER JOIN TABLE4 MMP
ON MPP.POPL_DX = MMP.POPL_DX) MPP
INNER JOIN TABLE8 MBR
ON MPP.MEMB_DX = MBR.MEMB_DX
INNER JOIN (SELECT FF.*,
CB.*,
ROUND(CB.FEE5 * FF.RATE,2) PM_A_P,
FF.P_NAME_1 || ', ' || FF.P_NAME_2 || ' ' || FF.P_NAME_3 P_NAME,
FF.BB_NAME_1_P || ', ' || FF.BB_NAME_2_P BB_NAME_P,
FF.BB_NAME_1_S || ', ' || FF.BB_NAME_2_S BB_NAME_S,
FF.BB_NAME_1_AX || ' ' || FF.BB_NAME_2_AX BB_NAME_AX,
CASE
WHEN FF.CE_D_ST IN ('A', 'B', 'C') THEN 'AC'
WHEN FF.DAY > 1 THEN 'NEW'
ELSE 'AB'
END DXC,
ROUND(CB.A_S,2) A_END,
SUBSTR(MP.CE_D_BCH_O, 13) AS BCH
FROM TABLE2 CB
INNER JOIN TABLE3 FF
ON FF.SETN_DX = CB.SETN_DX
LEFT OUTER JOIN TABLE5 MP
ON MP.TYX_DX = FF.TYX_DX
WHERE FF.EFF_BFX >= TO_DATE('01-JAN-2013', 'DD-MON-YYYY')) FF
ON MPP.TYX_DX = FF.TYX_DX
Best of luck.
I tried to make your query more readable:
SELECT MPPFF.*
FROM
(SELECT FF.*, MPP.*
FROM
(SELECT MBR.*, MPP.*
FROM
(SELECT MPP.*, MMP.*
FROM
(SELECT MPP.*, RPT.*
FROM TABLE1 MPP
LEFT JOIN (SELECT * FROM TABLE6 RPT, TABLE7 PP WHERE PP.PRTY_DX = RPT.PRTY_DX) RPT ON MPP.AS_DX = RPT.AS_DX) MPP
LEFT JOIN (SELECT * FROM TABLE4) MMP ON MPP.POPL_DX=MMP.POPL_DX) MPP,
(SELECT MBR.* FROM TABLE8 MBR) MBR
WHERE MPP.MEMB_DX = MBR.MEMB_DX) MPP
INNER JOIN (SELECT FF.*, CB.* FROM TABLE2 CB, TABLE3 FF
LEFT JOIN (SELECT * FROM TABLE5 MP ) MP ON MP.TYX_DX = FF.TYX_DX
WHERE FF.SETN_DX = CB.SETN_DX
AND EXTRACT( YEAR FROM FF.EFF_BFX) >=2013) FF ON MPP.TYX_DX = FF.TYX_DX) MPPFF
;
You select from 8 different tables and the only WHERE condition is EXTRACT( YEAR FROM FF.EFF_BFX) >= 2013
Unless the tables are tiny it will always take some time to query them all together.
Why do you mix ANSI join syntax and old-style Oracle join syntax?

JOIN by CASE expression

I would like to perform this Code
select * from a
right join s
on case when s.[Diff ] = 0 and a.ActivityDate < s.[ExecDate]
then a.ID1 =s.ID2
when
( a.ActivityDate <s.[ExecDate] and a.ActivityDate >= s.[Date3] )
then a.ID1 =s.ID2
END
The case is pointless. You join the same two fields ANYWAYS, so just add your case conditions to the join condition:
SELECT ...
JOIN ... ON ((a.ID1 = s.ID2) AND ((case #1) OR (case #2)))
Just to elaborate on Marc's answer, I think the simplest form is:
select *
from a right join
s
on a.ID1 = s.ID2 and a.ActivityDate < s.[ExecDate] and
(s.[Diff ] = 0 or a.ActivityDate >= s.[Date3])
Note that I do advise using left join instead of right join. It is usually more intuitive to read a query thinking "all the rows in the first table are kept, as well as matching rows in other tables."
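The simplified join condition, written as a left join from s as advised, can be tried out on a few rows. This is a sketch with Python's sqlite3 module and invented data; the column names (diff, exec_date, date3, activity_date) are simplified stand-ins for the bracketed names in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE s (id2 INTEGER, diff INTEGER, exec_date TEXT, date3 TEXT);
    CREATE TABLE a (id1 INTEGER, activity_date TEXT);
    INSERT INTO s VALUES
        (1, 0, '2020-03-01', '2020-02-01'),
        (2, 5, '2020-03-01', '2020-02-01'),
        (3, 5, '2020-03-01', '2020-02-01');
    INSERT INTO a VALUES
        (1, '2020-01-10'),  -- diff = 0 and before exec_date: matches
        (2, '2020-02-10'),  -- diff <> 0 but between date3 and exec_date: matches
        (3, '2020-01-10');  -- diff <> 0 and before date3: no match
""")

# s is the preserved table, so it goes first and the join is a LEFT JOIN.
rows = conn.execute("""
    SELECT s.id2, a.activity_date
    FROM s
    LEFT JOIN a
      ON a.id1 = s.id2
     AND a.activity_date < s.exec_date
     AND (s.diff = 0 OR a.activity_date >= s.date3)
    ORDER BY s.id2
""").fetchall()

print(rows)
```

Every row of s appears in the output; the third one carries NULL for the activity date because neither branch of the condition holds.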

How to perform CASE on values from subquery

This is going to be difficult to explain, but here goes.
I am looking to perform a CASE condition in a SELECT clause that will use the results of two calculations to determine which calculation value to return for a column value.
Maybe a code sample will help:
this works:
SELECT
A.[COLUMN1]
, B.[COLUMN1]
, CASE
WHEN A.[COLUMN2] + A.[COLUMN3] >= B.[COLUMN2] + B.[COLUMN3] THEN A.[COLUMN2] + A.[COLUMN3]
ELSE B.[COLUMN2] + B.[COLUMN3]
END
FROM
[TABLE_A] A
INNER JOIN [TABLE_B] B ON A.ID = B.ID
The problem here is that the query above, in the case statement, is forced to perform the calculation twice. Once for the WHEN clause and again for the THEN clause.
I want to do something like this, but SQL is not happy with it.
SELECT
A.[COLUMN1]
, B.[COLUMN1]
, CASE
WHEN AB.X >= AB.Y THEN AB.X
ELSE AB.Y
END
FROM ((A.[COLUMN2] + A.[COLUMN3]) X, (B.[COLUMN2] + B.[COLUMN3]) Y)
FROM
[TABLE_A] A
INNER JOIN [TABLE_B] B INNER JOIN ON A.ID = B.ID
Is this even possible? In the second example, I am calculating the values only once and referring to them in the case statement, both for the WHEN and the THEN clauses.
I would much prefer to push the calculations down into each table. This keeps the structure of the query quite similar. So, a syntactically correct (or almost correct) version would be:
SELECT A.[COLUMN1], B.[COLUMN1],
(CASE WHEN a.col_2_3 >= b.col_2_3 THEN a.col_2_3
ELSE b.col_2_3
end)
FROM (select a.*, (A.[COLUMN2] + A.[COLUMN3]) as col_2_3
from [TABLE_A] a
) a INNER JOIN
(select b.*, (B.[COLUMN2] + B.[COLUMN3]) as col_2_3
from [TABLE_B] b
)b
ON a.ID = b.ID
There are so many important factors in performance, and overhead for simple calculations is just not one of them. Reading the data and the join are way, way more expensive than simple calculations.
However, moving variables into subqueries is useful for a few reasons. First, the calculations could be more expensive (using subqueries, say). It also helps with readability and hence maintainability.
Finally, a SQL engine could decide to evaluate those expressions just once. In practice, I'm guessing that none make that trivial optimization.
You could reformulate it as this
SELECT a_column1,
b_column1,
CASE
WHEN x >= y THEN x
ELSE y
END AS foo
FROM (SELECT A.[column1] A_COLUMN1,
B.[column1] B_COLUMN1,
( A.[column2] + A.[column3] ) X,
( B.[column2] + B.[column3] ) Y
FROM [table_a] A
INNER JOIN [table_b] B
ON A.id = B.id)t
But I'm not sure it will make a difference, since the operations may be performed once per row anyway.