Oracle poor nested join performance - sql

I have a generic query builder that adds an arbitrary number of filters. I am getting poor performance on one of those filters (filter b) that requires going through two tables.
SELECT *
FROM (SELECT "TABLE_1".*
FROM "TABLE_1"
-- filter a: 1 table deep (fast)
inner join (SELECT "SHARED_ID"
FROM "TABLE_4"
WHERE "TABLE_4"."COLUMN_A" LIKE '%123%'
) "TABLE_4"
ON "TABLE_1"."SHARED_ID" = "TABLE_4"."SHARED_ID"
-- filter b: 2 tables deep (slow)
inner join (SELECT "SHARED_ID"
FROM "TABLE_2"
inner join (SELECT "ID"
FROM "TABLE_3"
WHERE NAME LIKE '%Abc%')
"TABLE_3"
ON "TABLE_2"."TABLE_3_ID" =
"TABLE_3"."ID") "TABLE_2"
ON "TABLE_1"."SHARED_ID" = "TABLE_2"."SHARED_ID")
WHERE ROWNUM <= 20

Related

SQL - How to put a condition for which table is selected without left join

I have a flag in a table whose value (1 for US, 2 for Global) indicates whether the data will be in Table A or Table B.
A solution that works is to left join both tables; however, this slows the scripts down significantly (from less than a second to over 15 seconds).
Is there another way to do this? Something equivalent to:
join TableA only if TableCore.CountryFlag = "US"
join TableB only if TableCore.CountryFlag = "global"
Thanks a lot for the help.
You can try using this approach:
-- US data
SELECT
YourColumns
FROM
TableCore
INNER JOIN TableA AS T ON TableCore.JoinColumn = T.JoinColumn
WHERE
TableCore.CountryFlag = 'US'
UNION ALL
-- Non-US Data
SELECT
YourColumns -- These columns must match in number and datatype with previous SELECT
FROM
TableCore
INNER JOIN TableB AS T ON TableCore.JoinColumn = T.JoinColumn
WHERE
TableCore.CountryFlag = 'global'
However, if the result is still slow, check whether TableCore has an index on CountryFlag and JoinColumn, and whether TableA and TableB each have an index on JoinColumn.
The basic structure is:
select . . ., coalesce(a.?, b.?) as ?
from tablecore c left join
tablea a
on c.? = a.? and c.countryflag = 'US' left join
tableb b
on c.? = b.? and c.countryflag = 'global';
This version of the query can take advantage of indexes on tablea(?) and tableb(?).
If you have a complex query, this portion is probably not responsible for the performance problem.
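As a rough check of the two answers above, the following sketch runs both shapes (UNION ALL of two inner joins, and two conditional left joins with coalesce) against a tiny in-memory SQLite database. The column name Val and the sample rows are made up for illustration, and SQLite stands in for whatever engine the question actually uses, so this says nothing about relative speed there.

```python
import sqlite3

# Hypothetical schema and data; only the table names and CountryFlag/JoinColumn
# come from the thread, the Val column is an assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableCore (JoinColumn INTEGER, CountryFlag TEXT);
    CREATE TABLE TableA (JoinColumn INTEGER, Val TEXT);
    CREATE TABLE TableB (JoinColumn INTEGER, Val TEXT);
    INSERT INTO TableCore VALUES (1, 'US'), (2, 'global'), (3, 'US');
    INSERT INTO TableA VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    INSERT INTO TableB VALUES (1, 'b1'), (2, 'b2');
""")

# Approach 1: one inner join per flag value, glued together with UNION ALL.
union_all_rows = sorted(conn.execute("""
    SELECT c.JoinColumn, t.Val
    FROM TableCore c
    INNER JOIN TableA t ON c.JoinColumn = t.JoinColumn
    WHERE c.CountryFlag = 'US'
    UNION ALL
    SELECT c.JoinColumn, t.Val
    FROM TableCore c
    INNER JOIN TableB t ON c.JoinColumn = t.JoinColumn
    WHERE c.CountryFlag = 'global'
""").fetchall())

# Approach 2: two left joins, each guarded by the flag, merged with COALESCE.
coalesce_rows = sorted(conn.execute("""
    SELECT c.JoinColumn, COALESCE(a.Val, b.Val)
    FROM TableCore c
    LEFT JOIN TableA a ON c.JoinColumn = a.JoinColumn AND c.CountryFlag = 'US'
    LEFT JOIN TableB b ON c.JoinColumn = b.JoinColumn AND c.CountryFlag = 'global'
""").fetchall())

print(union_all_rows)
print(coalesce_rows)
```

The two results agree here only because every core row has a match. A core row with no match in its target table would still appear (with a NULL value) in the left join version, but would be dropped by the UNION ALL version.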

Clustered and non-clustered index seeking increase execution time in stored procedure

I have a stored procedure that takes over 3 minutes to execute. The execution plan shows clustered index seeks and non-clustered index seeks:
[screenshot: index seeking]
[screenshot: clustered index seek]
My query:
SELECT distinct
[tbl_worflowprocess].[currenttid]
,USR2.[firstname] AS [prev_action_user_name]
,USR3.[firstname] AS [current_action_user_name]
,COD2.[Code] AS [reasontext]
,[tbl_application_details].[application_id] AS [ApplicationId]
,[tbl_application_details].[application_number] AS [ApplicationNumber]
,[dbo].[fn_app_GetApplicationId]([tbl_application_details].[link_application_id]) AS [LinkApplicationId]
,[tbl_application_details].[link_type] AS [LinkType]
,[dbo].[fn_app_CountProductsInApplication]([tbl_application_details].[application_id]) AS [ProductsCount]
,[tbl_application_details].[submission_date] AS [SubmissionDate]
,[tbl_jurisdiction].[jurisdictionname]
,[tbl_devicetype].[devicetype]
,COD1.[Code] AS [ClassificationName]
,EST1.[name] AS [ApplicantName]
,EST2.[name] AS [ManufacturerName]
,[dbo].[fnGetApplicationStatusFromTaskId]([tbl_worflowprocess].[currenttid]) AS [AppStatus]
,[dbo].[fnGetApplicationStatusText](@pLoggedInUserRoleId,[tbl_worflowprocess].[currenttid]) AS [StatusText]
,[Paid] = (CASE [tbl_application_details].[paid] WHEN 1 THEN 'Yes' ELSE 'No' END)
,[CreationDate] = [tbl_worflowprocess].[creationdate]
,[CommentText] =
(select CommentText
from dbo.tbl_application_comments
where Id = (select max(Id) from dbo.tbl_application_comments
where ApplicationId= [tbl_application_details].[application_id] and UserId = @pLoggedInUserID ))
,[LastCab] = (select isnull(dbo.fnGetLastCabForApplication([tbl_application_details].[application_id]),'-'))
,[tbl_application_details].[ArExpired]
FROM
[tbl_worflowprocess]
INNER JOIN
(SELECT
[application_id], [actionbyuser_id],
[actionbyrole_id], [reason_id], createddate
FROM
[tbl_applicationworkflowhistory]
INNER JOIN
(SELECT
[application_id] AS C1, MAX([version]) AS C2
FROM
[tbl_applicationworkflowhistory]
WHERE
(@pCurrentRoleId IS NULL
OR [application_id] IN (SELECT [application_id]
FROM [tbl_applicationworkflowhistory]
INNER JOIN [tbl_worflowprocess] ON [tbl_applicationworkflowhistory].[application_id] = [tbl_worflowprocess].[applicationid]
WHERE
(@pSearchInHistory = 0 OR [tbl_applicationworkflowhistory].[actionbyrole_id] = @pCurrentRoleId OR [tbl_worflowprocess].[currentroleid] = @pCurrentRoleId)
AND (@pSearchInHistory = 1 OR [tbl_worflowprocess].[currentroleid] = @pCurrentRoleId)
)
) AND
(@pCurrentUserId IS NULL OR [application_id] IN (
SELECT [application_id]
FROM [tbl_applicationworkflowhistory]
INNER JOIN [tbl_worflowprocess] ON [tbl_applicationworkflowhistory].[application_id]=[tbl_worflowprocess].[applicationid]
WHERE
(@pSearchInHistory=0 OR [tbl_applicationworkflowhistory].[actionbyuser_id] =@pCurrentUserId OR [tbl_worflowprocess].[currentuserid]=@pCurrentUserId)
AND (@pSearchInHistory=1 OR [tbl_worflowprocess].[currentuserid]=@pCurrentUserId)
)
) AND
(@pCurrentEstablishmentId IS NULL OR [application_id] IN (
SELECT [application_id]
FROM [tbl_applicationworkflowhistory]
INNER JOIN [tbl_worflowprocess] ON [tbl_applicationworkflowhistory].[application_id]=[tbl_worflowprocess].[applicationid]
WHERE
(@pSearchInHistory=0 OR [tbl_applicationworkflowhistory].[actionbyuser_id] IN
(SELECT [user_id] FROM [tbl_user] WHERE [establishment_id]=@pCurrentEstablishmentId) OR [tbl_worflowprocess].[currentuserid] IN
(SELECT [user_id] FROM [tbl_user] WHERE [establishment_id]=@pCurrentEstablishmentId))
AND (@pSearchInHistory=1 OR [tbl_worflowprocess].[currentuserid] IN
(SELECT [user_id] FROM [tbl_user] WHERE [establishment_id]=@pCurrentEstablishmentId))
)
)
GROUP BY [application_id]
)AS T1 ON ([tbl_applicationworkflowhistory].[application_id]=T1.C1 AND [tbl_applicationworkflowhistory].[version]=T1.C2)
) AS T2 ON([tbl_worflowprocess].[applicationid]=T2.[application_id])
INNER JOIN [tbl_application_details] ON [tbl_application_details].[application_id]=[tbl_worflowprocess].[applicationid]
INNER JOIN [tbl_user] USR1 ON USR1.[user_id]=[tbl_application_details].[responsible_user_id]
INNER JOIN [tbl_establishments] EST1 on EST1.[establishment_id] = USR1.[establishment_id]
LEFT OUTER JOIN [tbl_user] USR2 ON USR2.[user_id]=T2.[actionbyuser_id]
LEFT OUTER JOIN [tbl_user] USR3 ON USR3.[user_id]=[tbl_worflowprocess].[currentuserid]
LEFT OUTER JOIN [tbl_establishments] EST2 on EST2.[establishment_id] = [tbl_application_details].[manufacturer_id]
LEFT OUTER JOIN [tbl_jurisdiction] ON [tbl_jurisdiction].[jurisdiction_id]=[tbl_application_details].[jurisdiction_id]
LEFT OUTER JOIN [tbl_devicetype] ON [tbl_devicetype].[devicetype_id]=[tbl_application_details].[device_type_id]
LEFT OUTER JOIN [tbl_codes] COD1 ON COD1.[code_id]=[tbl_application_details].[device_classification_id]
LEFT OUTER JOIN [tbl_codes] COD2 ON COD2.[code_id]=T2.[reason_id]
LEFT OUTER JOIN [tbl_certificates] CERTF ON CERTF.[application_id]=[tbl_application_details].[application_id]
WHERE
(@pWFTasks IS NULL OR
[tbl_worflowprocess].[currenttid] IN (SELECT item
FROM [dbo].[fnSplit](@pWFTasks,',')))
Any way to improve my query?
Try creating indexes on the tables based on the query. Use the suggested performance improvements, if any exist and they do not interfere with the rest of your DB.
If the execution plan shows a table scan even though an index already exists on that field, try changing the index to include the columns you select.
If you can, avoid UDFs when the query returns many rows.
If you can pre-calculate, do so using table variables or even CTEs. For example, this (if it returns more than one value) can be stored in a table variable: SELECT [user_id] FROM [tbl_user] WHERE [establishment_id]=@pCurrentEstablishmentId
Queries such as "select max(Id) from dbo.tbl_application_comments" can be improved by storing the result in a simple variable before the main query.
Use SNAPSHOT or READ UNCOMMITTED isolation, or at least (nolock).
Make sure the statistics on the tables are up to date!
Check that you are using left join correctly (versus inner join, which is faster).
Limit the number of rows for each join as much as possible; use where clauses to cut down the data.
More advice could be given; query optimization is an interesting field.
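The pre-calculation tip above can be sketched as follows. This is only an illustration in SQLite, where a temporary table plays the role of a SQL Server table variable; the table est_users, the sample rows, and the value 100 for the establishment id are all assumptions, and whether this helps the real procedure has to be measured against its actual execution plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_user (user_id INTEGER, establishment_id INTEGER);
    CREATE TABLE tbl_worflowprocess (applicationid INTEGER, currentuserid INTEGER);
    INSERT INTO tbl_user VALUES (1, 100), (2, 100), (3, 200);
    INSERT INTO tbl_worflowprocess VALUES (10, 1), (11, 3), (12, 2);
    -- Hypothetical stand-in for a table variable holding the user ids.
    CREATE TEMP TABLE est_users (user_id INTEGER PRIMARY KEY);
""")

current_establishment_id = 100  # plays the role of @pCurrentEstablishmentId

# Run the repeated subquery once and keep its result.
conn.execute(
    "INSERT INTO est_users SELECT user_id FROM tbl_user WHERE establishment_id = ?",
    (current_establishment_id,),
)

# Later predicates read the small pre-computed set instead of re-running the subquery.
application_ids = [row[0] for row in conn.execute("""
    SELECT applicationid
    FROM tbl_worflowprocess
    WHERE currentuserid IN (SELECT user_id FROM est_users)
    ORDER BY applicationid
""")]

print(application_ids)
```

In the original query the same user id subquery appears three times, so it is a natural candidate for this treatment.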

Alternatives to full outer join for logical OR in tree structure query

I hope the title is clear enough. I've been implementing logical AND/OR for tree structures which are kept in the database, using a simple nodes and a parent-child association table.
A sample tree has a structure like this:
A sample tree structure query is as follows:
The double lines in the query pattern mean that A has a child of type B (somewhere down its child nodes) OR C. I have implemented A -> HASCHILD -> C -> HASCHILD -> E with an inner join, and A -> HASCHILD -> B -> HASCHILD -> E is implemented like this.
The trick is joining these two branches on A. Since this is an OR operation, either the B branch or the C branch may not exist. The only method I could think of is to use a full outer join of the two branches with A's node_id as the key. To avoid details, let me give this simplified snippet from my SQL query:
WITH A as (....),
B as (....),
C as (....),
......
SELECT *
from
A
INNER JOIN A_CONTAINS_B ON A.NODE_ID = A_CONTAINS_B.parent
INNER JOIN B ON A_CONTAINS_B.children @> ARRAY[B.NODE_ID]
INNER JOIN .....
full OUTER JOIN -- THIS IS WHERE TWO As ARE JOINED
(select
A2.NODE_ID AS A2_NODE_ID
from
A2
INNER JOIN A_CONTAINS_C ON A2.NODE_ID = A_CONTAINS_C.parent
INNER JOIN C ON A_CONTAINS_C.children @> ARRAY[C.NODE_ID]
INNER JOIN ....)
as other_branch
ON other_branch.A2_NODE_ID = A.NODE_ID
This query links two As which actually represent the same A using node_id, and if B or C does not exist, nothing breaks.
The result set has duplicates of course, but I can live with that. I can't, however, think of another way to implement OR in this context. ANDs are easy, they are inner joins, but a full outer join is the only approach that lets me connect the As. UNION ALL with dummy columns for both branches is not an option because I can't connect the As in that case.
Do you have any alternatives to what I'm doing here?
UPDATE
TokenMacGuy's suggestion gives me a cleaner route than what I have at the moment. I should have remembered UNION.
Using the first approach he has suggested, I can apply a query pattern decomposition, which would be a consistent way of breaking down queries with logical operators. The following is a visual representation of what I'm going to do, just in case it helps someone else visualize the process:
This helps me do a lot of nice things, including creating a nice result set where query pattern components are linked to results.
I've deliberately avoided details of tables or other context, because my question is about how to join results of queries. How I handle the hierarchy in the DB is a different topic which I'd like to avoid. I'll add more details in the comments. This is basically an EAV table accompanied by a hierarchy table. Just in case someone would like to see it, here is the query I'm running without any simplifications, after following TokenMacGuy's suggestion:
WITH
COMPOSITION1 as (select comp1.* from temp_eav_table_global as comp1
WHERE
comp1.actualrmtypename = 'COMPOSITION'),
composition_contains_observation as (select * from parent_child_arr_based),
OBSERVATION as (select obs.* from temp_eav_table_global as obs
WHERE
obs.actualrmtypename = 'OBSERVATION'),
observation_cnt_element as (select * from parent_child_arr_based),
OBS_ELM as (select obs_elm.* from temp_eav_table_global as obs_elm
WHERE
obs_elm.actualrmtypename= 'ELEMENT'),
COMPOSITION2 as (select comp_node_tbl2.* from temp_eav_table_global as comp_node_tbl2
where
comp_node_tbl2.actualrmtypename = 'COMPOSITION'),
composition_contains_evaluation as (select * from parent_child_arr_based),
EVALUATION as (select eva_node_tbl.* from temp_eav_table_global as eva_node_tbl
where
eva_node_tbl.actualrmtypename = 'EVALUATION'),
eval_contains_element as (select * from parent_child_arr_based),
ELEMENT as (select el_node_tbl.* from temp_eav_table_global as el_node_tbl
where
el_node_tbl.actualrmtypename = 'ELEMENT')
select
'branch1' as branchid,
COMPOSITION1.featuremappingid as comprootid,
OBSERVATION.featuremappingid as obs_ftid,
OBSERVATION.actualrmtypename as obs_tn,
null as ev_ftid,
null as ev_tn,
OBS_ELM.featuremappingid as obs_elm_fid,
OBS_ELm.actualrmtypename as obs_elm_tn,
null as ev_el_ftid,
null as ev_el_tn
from
COMPOSITION1
INNER JOIN composition_contains_observation ON COMPOSITION1.featuremappingid = composition_contains_observation.parent
INNER JOIN OBSERVATION ON composition_contains_observation.children @> ARRAY[OBSERVATION.featuremappingid]
INNER JOIN observation_cnt_element on observation_cnt_element.parent = OBSERVATION.featuremappingid
INNER JOIN OBS_ELM ON observation_cnt_element.children @> ARRAY[obs_elm.featuremappingid]
UNION
SELECT
'branch2' as branchid,
COMPOSITION2.featuremappingid as comprootid,
null as obs_ftid,
null as obs_tn,
EVALUATION.featuremappingid as ev_ftid,
EVALUATION.actualrmtypename as ev_tn,
null as obs_elm_fid,
null as obs_elm_tn,
ELEMENT.featuremappingid as ev_el_ftid,
ELEMENT.actualrmtypename as ev_el_tn
from
COMPOSITION2
INNER JOIN composition_contains_evaluation ON COMPOSITION2.featuremappingid = composition_contains_evaluation.parent
INNER JOIN EVALUATION ON composition_contains_evaluation.children @> ARRAY[EVALUATION.featuremappingid]
INNER JOIN eval_contains_element ON EVALUATION.featuremappingid = eval_contains_element.parent
INNER JOIN ELEMENT on eval_contains_element.children @> ARRAY[ELEMENT.featuremappingid]
The relational equivalent to ∨ is ∪. You could either use union to combine a JOIN b JOIN e with a JOIN c JOIN e, or just use the union of b and c and join on the resulting, combined relation, something like a JOIN (b UNION c) JOIN e
More completely:
SELECT *
FROM a
JOIN (
SELECT
'B' source_relation,
parent,
b.child,
b_thing row_from_b,
NULL row_from_c
FROM a_contains_b JOIN b ON a_contains_b.child = b.node_id
UNION
SELECT
'C',
parent,
c.child,
NULL,
c_thing
FROM a_contains_c JOIN c ON a_contains_c.child = c.node_id
) a_c ON a.node_id = a_c.parent
JOIN e ON a_c.child = e.node_id;
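A minimal runnable sketch of the a JOIN (b UNION c) shape, using SQLite and invented sample rows. The final join to e is left out to keep it short, and the child column is taken from the association tables here, which is an assumption about the schema rather than something the answer states.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (node_id INTEGER);
    CREATE TABLE b (node_id INTEGER, b_thing TEXT);
    CREATE TABLE c (node_id INTEGER, c_thing TEXT);
    CREATE TABLE a_contains_b (parent INTEGER, child INTEGER);
    CREATE TABLE a_contains_c (parent INTEGER, child INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (10, 'b10');
    INSERT INTO c VALUES (20, 'c20');
    INSERT INTO a_contains_b VALUES (1, 10);  -- node 1 has a B child
    INSERT INTO a_contains_c VALUES (2, 20);  -- node 2 has a C child
""")

# "A has a child B OR a child C": union the two branches, then join once on A.
rows = conn.execute("""
    SELECT a.node_id, bc.source_relation, bc.child, bc.row_from_b, bc.row_from_c
    FROM a
    JOIN (
        SELECT 'B' AS source_relation, acb.parent AS parent, acb.child AS child,
               b.b_thing AS row_from_b, NULL AS row_from_c
        FROM a_contains_b acb JOIN b ON acb.child = b.node_id
        UNION
        SELECT 'C', acc.parent, acc.child, NULL, c.c_thing
        FROM a_contains_c acc JOIN c ON acc.child = c.node_id
    ) bc ON a.node_id = bc.parent
    ORDER BY a.node_id
""").fetchall()

for row in rows:
    print(row)
```

Node 3 has neither kind of child and so does not appear, while nodes 1 and 2 each come back once, tagged with the branch that matched.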

Joining two tables on a key and then left outer joining a table on a number of criteria

I'm attempting to join 3 tables together in a single query. The first two have a key so each entry has a matching entry. This joined table will then be joined by a third table that could produce multiple entries for each entry from the first table (the joined ones).
select * from
(select a.bidentifier, a.bsession, a.symbol, b.jidentifier, b.JSession
from trade_monthly a, trade_monthly_second b
where
a.bidentifier = b.jidentifier AND
a.bsession = b.JSession)
left outer join
trade c
on c.symbol = a.symbol
order by a.bidentifier, a.bsession, a.symbol, b.jidentifier, b.JSession, c.symbol
There will be more criteria (not just c.symbol = a.symbol) on the left outer join, but for now this should be useful. How can I nest the queries this way? I'm getting an "SQL command not properly ended" error.
Any help is appreciated.
Thanks
As far as I know, every derived table must be given a name, so try something like this:
SELECT * FROM
(SELECT a.bidentifier, ....
...
a.bsession = b.JSession) t
LEFT JOIN trade c
ON c.symbol = t.symbol
ORDER BY t.bidentifier, ...
Anyway I think you could use a simpler query:
SELECT a.bidentifier, a.bsession, a.symbol, b.jidentifier, b.JSession, c.*
FROM trade_monthly a
INNER JOIN trade_monthly_second b
ON a.bidentifier = b.jidentifier
AND a.bsession = b.JSession
LEFT JOIN trade c
ON c.symbol = a.symbol
ORDER BY a.bidentifier, a.bsession, a.symbol, b.jidentifier, b.JSession, c.symbol
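To illustrate the two forms in this answer, here is a small sketch against SQLite with invented rows and an assumed price column on trade. It shows the named derived table (alias t, with outer references going through t rather than a or b) next to the flat join; it does not reproduce the original engine's error message.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trade_monthly (bidentifier INTEGER, bsession TEXT, symbol TEXT);
    CREATE TABLE trade_monthly_second (jidentifier INTEGER, jsession TEXT);
    CREATE TABLE trade (symbol TEXT, price REAL);  -- price is a made-up column
    INSERT INTO trade_monthly VALUES (1, 's1', 'AAA'), (2, 's2', 'BBB');
    INSERT INTO trade_monthly_second VALUES (1, 's1'), (2, 's2');
    INSERT INTO trade VALUES ('AAA', 10.0), ('AAA', 11.0);
""")

# Form 1: the inner query is a derived table named t; a and b are not visible outside it.
derived_rows = conn.execute("""
    SELECT t.bidentifier, t.symbol, c.price
    FROM (SELECT a.bidentifier, a.bsession, a.symbol, b.jidentifier, b.jsession
          FROM trade_monthly a, trade_monthly_second b
          WHERE a.bidentifier = b.jidentifier
            AND a.bsession = b.jsession) t
    LEFT JOIN trade c ON c.symbol = t.symbol
    ORDER BY t.bidentifier, c.price
""").fetchall()

# Form 2: the same joins written flat, without nesting.
flat_rows = conn.execute("""
    SELECT a.bidentifier, a.symbol, c.price
    FROM trade_monthly a
    INNER JOIN trade_monthly_second b
        ON a.bidentifier = b.jidentifier AND a.bsession = b.jsession
    LEFT JOIN trade c ON c.symbol = a.symbol
    ORDER BY a.bidentifier, c.price
""").fetchall()

print(derived_rows)
print(flat_rows)
```

Symbol AAA yields two rows because it has two trades, and BBB still appears once with a NULL price, which is the left outer join behaviour the question asks for.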
Try this:
SELECT
`trade_monthly`.`bidentifier` AS `bidentifier`,
`trade_monthly`.`bsession` AS `bsession`,
`trade_monthly`.`symbol` AS `symbol`,
`trade_monthly_second`.`jidentifier` AS `jidentifier`,
`trade_monthly_second`.`jsession` AS `jsession`
FROM
(
(
`trade_monthly`
JOIN `trade_monthly_second` ON(
(
(
`trade_monthly`.`bidentifier` = `trade_monthly_second`.`jidentifier`
)
AND(
`trade_monthly`.`bsession` = `trade_monthly_second`.`jsession`
)
)
)
)
JOIN `trade` ON(
(
`trade`.`symbol` = `trade_monthly`.`symbol`
)
)
)
ORDER BY
`trade_monthly`.`bidentifier`,
`trade_monthly`.`bsession`,
`trade_monthly`.`symbol`,
`trade_monthly_second`.`jidentifier`,
`trade_monthly_second`.`jsession`,
`trade`.`symbol`
Why don't you just create a view of the two inner joined tables? Then you can build a query that joins this view to the trade table using the left outer join matching criteria.
In my opinion, views are one of the most overlooked solutions to a lot of complex queries.
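The view suggestion might look roughly like this. The view name trade_monthly_joined, the price column, and the sample rows are hypothetical, and SQLite is used only so the sketch can be run as-is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trade_monthly (bidentifier INTEGER, bsession TEXT, symbol TEXT);
    CREATE TABLE trade_monthly_second (jidentifier INTEGER, jsession TEXT);
    CREATE TABLE trade (symbol TEXT, price REAL);  -- price is a made-up column
    INSERT INTO trade_monthly VALUES (1, 's1', 'AAA'), (2, 's2', 'BBB'), (3, 's3', 'CCC');
    INSERT INTO trade_monthly_second VALUES (1, 's1'), (2, 's2');
    INSERT INTO trade VALUES ('AAA', 10.0);

    -- The inner join of the two keyed tables, packaged once as a view.
    CREATE VIEW trade_monthly_joined AS
        SELECT a.bidentifier, a.bsession, a.symbol, b.jidentifier, b.jsession
        FROM trade_monthly a
        INNER JOIN trade_monthly_second b
            ON a.bidentifier = b.jidentifier AND a.bsession = b.jsession;
""")

# The outer query only has to express the left outer join criteria.
view_rows = conn.execute("""
    SELECT v.bidentifier, v.symbol, c.price
    FROM trade_monthly_joined v
    LEFT JOIN trade c ON c.symbol = v.symbol
    ORDER BY v.bidentifier
""").fetchall()

print(view_rows)
```

Extra matching criteria for trade can then be added to the ON clause of the outer query without touching the view.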

Replace IN with EXISTS or COUNT. How to do it. What is missing here?

I am using the IN keyword in the middle of a query. Since it is a nested query, I want to replace IN with EXISTS because of performance issues that my seniors have told me might arise.
Am I missing some column? Below is the query in question; it uses some aliases for readability.
How can I remove the IN?
SELECT TX.PK_MAP_ID AS MAP_ID
, MG.PK_GUEST_ID AS Guest_Id
, MG.FIRST_NAME
, H.PK_CATEGORY_ID AS Preference_Id
, H.DESCRIPTION AS Preference_Name
, H.FK_CATEGORY_ID AS Parent_Id
, H.IMMEDIATE_PARENT AS Parent_Name
, H.Department_ID
, H.Department_Name
, H.ID_PATH, H.DESC_PATH
FROM
dbo.M_GUEST AS MG
LEFT OUTER JOIN
dbo.TX_MAP_GUEST_PREFERENCE AS TX
ON
(MG.PK_GUEST_ID = TX.FK_GUEST_ID)
LEFT OUTER JOIN
dbo.GetHierarchy_Table AS H
ON
(TX.FK_CATEGORY_ID = H.PK_CATEGORY_ID)
WHERE
(MG.IS_ACTIVE = 1)
AND
(TX.IS_ACTIVE = 1)
AND
(H.Department_ID IN -----How to remove this IN operator with EXISTS or Count()
(
SELECT C.PK_CATEGORY_ID AS DepartmentId
FROM
dbo.TX_MAP_DEPARTMENT_OPERATOR AS D
INNER JOIN
dbo.M_OPERATOR AS M
ON
(D.FK_OPERATOR_ID = M.PK_OPERATOR_ID)
AND
(D.IS_ACTIVE = M.IS_ACTIVE)
INNER JOIN
dbo.L_USER_ROLE AS R
ON
(M.FK_ROLE_ID = R.PK_ROLE_ID)
AND
(M.IS_ACTIVE = R.IS_ACTIVE)
INNER JOIN
dbo.L_CATEGORY_TYPE AS C
ON
(D.FK_DEPARTMENT_ID = C.PK_CATEGORY_ID)
AND
(D.IS_ACTIVE = C.IS_ACTIVE)
WHERE
(D.IS_ACTIVE = 1)
AND
(M.IS_ACTIVE = 1)
AND
(R.IS_ACTIVE = 1)
AND
(C.IS_ACTIVE = 1)
)--END INNER QUERY
)--END Condition
What new problems might I get if I replace IN with EXISTS or COUNT ?
Basically, as I understand your question, you are asking how can I replace this:
where H.department_id in (select departmentid from...)
with this:
where exists (select...)
or this:
where (select count(*) from ...) > 0
It is fairly straightforward. One method might be this:
WHERE...
AND EXISTS (select c.pk_category_id
from tx_map_department_operator d
inner join m_operator as m
on d.fk_operator_id = m.pk_operator_id
inner join l_user_role r
on m.fk_role_id = r.pk_role_id
inner join l_category_type c
on d.fk_department_id = c.pk_category_id
where h.department_id = c.pk_category_id
and d.is_active = 1
and m.is_active = 1
and r.is_active = 1
and c.is_active = 1
)
I removed the extra joins on is_active because they were redundant. You should test how it runs with your indexes, because that might have been faster. I doubt it though. But it is worth comparing whether it is faster to add the join clause (join on ... and x.is_active=y.is_active) or to check in the where clause (x.is_active=1 and y.is_active=1 and z.is_active=1...)
And I'd recommend you just use exists instead of count(*), because exists can stop after finding one row, whereas count probably continues to execute until done, and only then compares the result to your reference value (count > 0).
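The three predicate forms can be compared on a toy example. The tables h and dept below are heavily simplified stand-ins for the real ones, the rows are invented, and SQLite is used only to make the sketch runnable, so this says nothing about which form the real optimizer prefers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE h (category_id INTEGER, department_id INTEGER);
    CREATE TABLE dept (pk_category_id INTEGER, is_active INTEGER);
    INSERT INTO h VALUES (1, 10), (2, 20), (3, 30);
    -- Department 20 is inactive; department 30 appears twice.
    INSERT INTO dept VALUES (10, 1), (20, 0), (30, 1), (30, 1);
""")

def ids(where_clause):
    """Return the h.category_id values selected by the given predicate."""
    sql = f"SELECT category_id FROM h WHERE {where_clause} ORDER BY category_id"
    return [row[0] for row in conn.execute(sql)]

correlated = "FROM dept d WHERE d.pk_category_id = h.department_id AND d.is_active = 1"

in_ids = ids("h.department_id IN (SELECT d.pk_category_id FROM dept d WHERE d.is_active = 1)")
exists_ids = ids(f"EXISTS (SELECT 1 {correlated})")
count_gt0_ids = ids(f"(SELECT COUNT(*) {correlated}) > 0")
count_gt1_ids = ids(f"(SELECT COUNT(*) {correlated}) > 1")

print(in_ids, exists_ids, count_gt0_ids, count_gt1_ids)
```

IN, EXISTS, and COUNT(*) > 0 select the same rows. COUNT(*) > 1 is a different condition: it keeps only rows whose department matches more than once, so it is not a drop-in replacement for IN.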
As an aside, that is a strange column naming standard you have. Do you really have PK prefixes for the primary keys, and FK prefixes for the foreign keys? I have never seen that.