Is there any way to optimize this select - sql

So I have this big select which uses a union. I'm wondering whether there is any way to optimize it, because it's quite heavy right now.
As you can see, the main difference is between the joined tables (srv_obj_intermediate and srv_obj_attributes), and the two selects use 2 different oet.code values:
SELECT * FROM (SELECT soi.value, srv.osp_id, soi.stya_id, eax.estpt_id, eax.discount, seo.id AS sero_id FROM estimate_attr_xref eax
JOIN attribute_types attl ON attl.id = eax.attr_id
JOIN object_attr_type_links oatl ON oatl.attr_id = attl.id
JOIN service_type_attributes sta ON sta.objt_attr_id = oatl.id
JOIN srv_obj_intermediate soi ON soi.stya_id = sta.id
AND ((soi.value = 0) OR (soi.value = 1 AND festpae_id IS NOT NULL))
JOIN service_objects seo ON seo.id = soi.sero_id
JOIN services srv ON srv.id = seo.srv_id
JOIN order_event oet ON oet.code = 'INTERMEDIATE'
WHERE eax.rate = 1 AND eax.ordet_id = oet.id
AND eax.objt_attr_id = sta.objt_attr_id) WHERE value = 1
UNION
SELECT soa.value, srv.osp_id, soa.stya_id, eax.estpt_id, eax.discount, seo.id AS sero_id FROM estimate_attr_xref eax
JOIN attribute_types attl ON attl.id = eax.attr_id
JOIN object_attr_type_links oatl ON oatl.attr_id = attl.id
JOIN service_type_attributes sta ON sta.objt_attr_id = oatl.id
JOIN srv_obj_attributes soa ON soa.stya_id = sta.id
AND soa.value = 1
LEFT JOIN srv_obj_intermediate soi ON soi.stya_id = sta.id
AND soi.value = 1
JOIN service_objects seo ON seo.id = soa.sero_id
JOIN services srv ON srv.id = seo.srv_id
JOIN order_event oet ON oet.code = 'INITIAL'
WHERE eax.rate = 1 AND eax.ordet_id = oet.id
AND eax.objt_attr_id = sta.objt_attr_id AND soi.value IS NULL

At a quick glance and without having any insight into your data, relationships, volumes, indexes, partitioning, cpu etc.
(1) In the first outer select (i.e. before UNION), you seem to have a filter WHERE value = 1, where value is actually soi.value. In the inner select you have the condition ((soi.value = 0) OR (soi.value = 1 AND festpae_id IS NOT NULL)). Wouldn't it suffice to just use soi.value = 1 AND festpae_id IS NOT NULL in the inner select and avoid the outer select? What soi.value are you looking for?
(2) Similarly, in the second select you have LEFT JOIN srv_obj_intermediate soi ON soi.stya_id = sta.id and further down you have a filter AND soi.value IS NULL. Again, what soi.value are you looking for?
(3) Consider moving the oet.code filter predicate into the WHERE clause and using JOIN order_event oet ON eax.ordet_id = oet.id, for reasons mentioned here, although this doesn't guarantee a performance improvement. You'll need to review if and how the execution plan changes in each case.
(4) Are stats up-to-date on all these tables?
(5) Have you reviewed the plan? Are you missing any joins and/or having cartesian joins in the plan? Are you seeing full table scans when you expect usage of an index or expect partition pruning? This white paper is a good starting point if you're unfamiliar with explain plans.
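Points (1) and (2) can be checked on toy data. The sketch below uses Python's built-in sqlite3 module as a stand-in for the real database. The miniature tables and their rows are invented for illustration; only the column names value, festpae_id and stya_id come from the question. It suggests that the OR filter followed by the outer value = 1 filter selects the same rows as the single predicate, and that the LEFT JOIN ... IS NULL pattern in the second select works as an anti-join (compared here with NOT EXISTS).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical miniature tables; the real schemas are not shown in the question.
    CREATE TABLE soi (stya_id INTEGER, value INTEGER, festpae_id INTEGER);
    CREATE TABLE sta (id INTEGER);
    INSERT INTO soi VALUES (1, 0, NULL), (2, 1, NULL), (3, 1, 7), (4, 0, 9);
    INSERT INTO sta VALUES (1), (2), (3), (4), (5);
""")

# Point (1): the nested filter as written in the question.
nested = conn.execute("""
    SELECT stya_id FROM (
        SELECT stya_id, value FROM soi
        WHERE (value = 0) OR (value = 1 AND festpae_id IS NOT NULL)
    ) WHERE value = 1
    ORDER BY stya_id
""").fetchall()

# The same rows with a single predicate and no outer select.
flat = conn.execute("""
    SELECT stya_id FROM soi
    WHERE value = 1 AND festpae_id IS NOT NULL
    ORDER BY stya_id
""").fetchall()

# Point (2): LEFT JOIN + IS NULL keeps sta rows that have no soi row with value = 1.
left_join_null = conn.execute("""
    SELECT sta.id FROM sta
    LEFT JOIN soi ON soi.stya_id = sta.id AND soi.value = 1
    WHERE soi.value IS NULL
    ORDER BY sta.id
""").fetchall()

# The same anti-join written with NOT EXISTS.
not_exists = conn.execute("""
    SELECT sta.id FROM sta
    WHERE NOT EXISTS (
        SELECT 1 FROM soi WHERE soi.stya_id = sta.id AND soi.value = 1
    )
    ORDER BY sta.id
""").fetchall()

print("nested:", nested, "flat:", flat)
print("left join / is null:", left_join_null, "not exists:", not_exists)
```

Whether either form is faster on the real database still has to be confirmed from the execution plan; the sketch only checks that the results match.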

Related

How to improve SQL inner join performance?

How can I improve the performance of this query? The inner join to the second table, CustomerAccountBrand, takes a long time. I have added a nonclustered index, but it is not used. Could I split this into two inner joins and then combine the results? Any help would be appreciated.
SELECT DISTINCT
RA.AccountNumber,
RA.ShipTo,
RA.SystemCode,
CAB.BrandCode
FROM dbo.CustomerAccountRelatedAccounts RA -- Views
INNER JOIN dbo.CustomerAccount CA
ON RA.RelatedAccountNumber = CA.AccountNumber
AND RA.RelatedShipTo = CA.ShipTo
AND RA.RelatedSystemCode = CA.SystemCode
INNER JOIN dbo.CustomerAccountBrand CAB ---- Taking long time 4:30 mins
ON CA.AccountNumber = CAB.AccountNumber
AND CA.ShipTo = CAB.ShipTo
AND CA.SystemCode = CAB.SystemCode
ALTER VIEW [dbo].[CustomerAccountRelatedAccounts]
AS
SELECT
ca.AccountNumber, ca.ShipTo, ca.SystemCode, cafg.AccountNumber AS RelatedAccountNumber, cafg.ShipTo AS RelatedShipTo,
cafg.SystemCode AS RelatedSystemCode
FROM dbo.CustomerAccount AS ca
LEFT OUTER JOIN dbo.CustomerAccount AS cafg
ON ca.FinancialGroup = cafg.FinancialGroup
AND ca.NationalAccount = cafg.NationalAccount
AND cafg.IsActive = 1
WHERE CA.IsActive = 1
From my experience, the SQL Server query optimizer often fails to pick the correct join algorithm when queries become more complex (e.g. joining with your view means that there's no index readily available to join on). If that's what's happening here, then the easy fix is to add a join hint to turn it into a hash join:
SELECT DISTINCT
RA.AccountNumber,
RA.ShipTo,
RA.SystemCode,
CAB.BrandCode
FROM dbo.CustomerAccountRelatedAccounts RA -- Views
INNER JOIN dbo.CustomerAccount CA
ON RA.RelatedAccountNumber = CA.AccountNumber
AND RA.RelatedShipTo = CA.ShipTo
AND RA.RelatedSystemCode = CA.SystemCode
INNER HASH JOIN dbo.CustomerAccountBrand CAB ---- Note the "HASH" keyword
ON CA.AccountNumber = CAB.AccountNumber
AND CA.ShipTo = CAB.ShipTo
AND CA.SystemCode = CAB.SystemCode

Simplifying where clause when a column is mutual for the tables at from clause

I would like to learn if there is any more efficient way to write the query below:
SELECT *
FROM requests srp
INNER JOIN surgeons rpsur
ON rpsur.id = srp.surgeon_id
LEFT OUTER JOIN #usersurgeons usersurgeons
ON usersurgeons.surgeon_id = srp.surgeon_id
LEFT OUTER JOIN surgeons LOsurgeons
ON usersurgeons.surgeon_id = LOsurgeons.id
LEFT OUTER JOIN provsurgeons LOprovsurgeons
ON LOprovsurgeons.id = LOsurgeons.provsurgeon_id
INNER JOIN #selectedsurgeons up
ON up.surgeon_id = rpsur.id
LEFT OUTER JOIN provsurgeons ps
ON ps.id = rpsur.provsurgeon_id
WHERE rpsur.isprimary = 0
AND usersurgeons.isprimary = 0
AND LOsurgeons.isprimary = 0
AND LOprovsurgeons.isprimary = 0
AND up.isprimary = 0
AND ps.isprimary = 0
I am not happy with the where clause here, is there any more professional way to write this, rather than adding the clauses to the join lines (such as on xx.id = yy.id and xx.isPrimary=0)??
From this query alone there are not many things that can be said. You should consider adding some more context (how do you get data into those temporary tables and the structure of %surgeons tables):
1) SELECT * makes it almost impossible for an index to cover the query and also returns a lot of columns (requests.*, surgeons.*, provsurgeons.* etc.) in your final result. Return only the columns that you need.
2) If isPrimary = 0 filtering is performed often in your queries (not just this one), you can consider creating a view that fetches data filtered by isPrimary = 0. E.g. vwSurgeons, vwProvsurgeons. Then, you can just JOIN directly to the view instead of the table.
3) [already mentioned in the comments] Any condition that excludes NULL values for the OUTER JOINed table will transform the OUTER into INNER.
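Point 3 is easy to see on toy data. This is a minimal sketch using Python's built-in sqlite3 module; the two tables are hypothetical cut-down versions of requests and #usersurgeons (a plain table stands in for the temp table), and the rows are invented. With the isprimary = 0 test in the WHERE clause, the LEFT OUTER JOIN returns exactly what an INNER JOIN returns, because the NULL-extended rows fail the test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data: surgeon 30 has no usersurgeons row at all.
    CREATE TABLE requests (id INTEGER, surgeon_id INTEGER);
    CREATE TABLE usersurgeons (surgeon_id INTEGER, isprimary INTEGER);
    INSERT INTO requests VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO usersurgeons VALUES (10, 0), (20, 1);
""")

def run(connection, join_type, where_clause):
    """Run the same join with a given join type and optional WHERE clause."""
    sql = f"""
        SELECT srp.id, us.isprimary
        FROM requests srp
        {join_type} JOIN usersurgeons us ON us.surgeon_id = srp.surgeon_id
        {where_clause}
        ORDER BY srp.id
    """
    return connection.execute(sql).fetchall()

# No filter: the outer join keeps request 3 with a NULL isprimary.
unfiltered_left = run(conn, "LEFT OUTER", "")

# Filter in WHERE: NULL = 0 is not true, so the NULL-extended row disappears.
left_rows = run(conn, "LEFT OUTER", "WHERE us.isprimary = 0")
inner_rows = run(conn, "INNER", "WHERE us.isprimary = 0")

print("left, no filter:", unfiltered_left)
print("left + where:   ", left_rows)
print("inner + where:  ", inner_rows)
```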
Instead of joining all the tables and having a WHERE clause at the end, use derived tables that contain only the filtered records. This may improve your query performance:
SELECT *
FROM requests srp
INNER JOIN surgeons rpsur
ON rpsur.id = srp.surgeon_id
LEFT OUTER JOIN
(
SELECT *
FROM #usersurgeons
WHERE isprimary = 0
)usersurgeons
ON usersurgeons.surgeon_id = srp.surgeon_id
LEFT OUTER JOIN
(
SELECT *
FROM surgeons
WHERE isprimary = 0
)LOsurgeons
ON usersurgeons.surgeon_id = LOsurgeons.id
LEFT OUTER JOIN
(
SELECT *
FROM provsurgeons
WHERE isprimary = 0
)LOprovsurgeons
ON LOprovsurgeons.id = LOsurgeons.provsurgeon_id
INNER JOIN
(
SELECT *
FROM #selectedsurgeons
WHERE isprimary = 0
)up
ON up.surgeon_id = rpsur.id
LEFT OUTER JOIN
(
SELECT *
FROM provsurgeons
WHERE isprimary = 0
) ps
ON ps.id = rpsur.provsurgeon_id
WHERE rpsur.isprimary = 0
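A caveat worth checking before adopting the derived-table version: it is not a pure refactoring of the original query. Filtering inside the derived table keeps requests that have no matching row, whereas the original WHERE clause discarded them. The sketch below illustrates this with Python's built-in sqlite3 module on hypothetical two-table data (a plain usersurgeons table stands in for #usersurgeons).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data: surgeon 20 is primary, surgeon 30 has no row.
    CREATE TABLE requests (id INTEGER, surgeon_id INTEGER);
    CREATE TABLE usersurgeons (surgeon_id INTEGER, isprimary INTEGER);
    INSERT INTO requests VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO usersurgeons VALUES (10, 0), (20, 1);
""")

# Original shape: outer join, then filter on the outer-joined table in WHERE.
original = conn.execute("""
    SELECT srp.id, us.isprimary
    FROM requests srp
    LEFT OUTER JOIN usersurgeons us ON us.surgeon_id = srp.surgeon_id
    WHERE us.isprimary = 0
    ORDER BY srp.id
""").fetchall()

# Rewritten shape: filter inside a derived table, then outer join to it.
derived = conn.execute("""
    SELECT srp.id, us.isprimary
    FROM requests srp
    LEFT OUTER JOIN (
        SELECT * FROM usersurgeons WHERE isprimary = 0
    ) us ON us.surgeon_id = srp.surgeon_id
    ORDER BY srp.id
""").fetchall()

print("original:", original)
print("derived: ", derived)
```

If the extra NULL-extended rows are wanted, the derived-table form is the right one; if not, INNER JOINs express the original intent more directly.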

Why is a subselect with CASE faster than a JOIN with OR in Oracle

I was optimizing one of the horrible views we have, and it came as a surprise that one of the subselects with CASE statements was running faster than a LEFT JOIN with OR. The original view is substantially bigger, but the parts that I am interested in can be boiled down to the following queries:
SELECT CASE
WHEN tdcurr.productid = 1 THEN (SELECT addressid
FROM address a
WHERE a.customerid = tm.customerid
AND a.addressid =
tdcurr.addressid
AND a.addresstypeid = 3)
WHEN tdcurr.productid = 2 THEN (SELECT addressid
FROM address a
WHERE a.customerid = tm.customerid
AND a.addressid =
tdcurr.addressid
AND a.addresstypeid = 4)
END AS t_buyselladdressid
FROM vleaf_transactiondetail_all tdcurr
inner join transactionmain tm
ON tm.transactionid = tdcurr.transactionid
Execution plan
while the one with the join is consistently slower:
SELECT bsaddr.addressid AS t_buyselladdressid
FROM vleaf_transactiondetail_all tdcurr
inner join transactionmain tm
ON tm.transactionid = tdcurr.transactionid
left outer join address bsaddr
ON tm.customerid = bsaddr.customerid
AND bsaddr.addressid = tdcurr.addressid
AND ( ( tdcurr.productid = 1
AND bsaddr.addresstypeid = 3 )
OR ( tdcurr.productid = 2
AND bsaddr.addresstypeid = 4 ) )
Execution Plan
Why would this be the case?
It's possible that the SQL with subselects is benefitting from scalar subquery caching. From the explain plans, it definitely looks like it's benefitting from not doing the Nested Loops Outer Join!
See https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:2683853500346598211 for more information about scalar subquery caching.
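To give a feel for why scalar subquery caching can help, here is a toy model in Python. It is only an analogy and not Oracle's implementation: the class, the address data and the driving rows are all hypothetical, and Oracle's real cache is internal and limited in size. The idea is that the subquery result is remembered per distinct combination of input values, so repeated inputs do not run the subquery again.

```python
class ScalarSubqueryCache:
    """Toy model: remember the subquery result for each distinct set of inputs."""

    def __init__(self, subquery):
        self.subquery = subquery
        self.cache = {}
        self.executions = 0  # how many times the subquery really ran

    def lookup(self, *key):
        if key not in self.cache:
            self.executions += 1
            self.cache[key] = self.subquery(*key)
        return self.cache[key]


def make_address_subquery(addresses):
    """Build a stand-in for: SELECT addressid FROM address WHERE ... (hypothetical data)."""
    def subquery(customerid, addressid, addresstypeid):
        return addresses.get((customerid, addressid, addresstypeid))
    return subquery


# Hypothetical address rows keyed by (customerid, addressid, addresstypeid).
addresses = {(1, 100, 3): 100, (2, 200, 4): 200}
cached = ScalarSubqueryCache(make_address_subquery(addresses))

# Driving rows with many repeated input combinations, as in a large detail table.
driving_rows = [(1, 100, 3), (1, 100, 3), (2, 200, 4),
                (1, 100, 3), (2, 200, 4), (3, 300, 3)]

results = []
for row in driving_rows:
    results.append(cached.lookup(*row))

print(results)
print(cached.executions, "subquery executions for", len(driving_rows), "rows")
```

The benefit depends on how often the same input values repeat; with mostly distinct inputs, a cache like this saves very little.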

Is there a better way to write this Oracle SQL query?

I have been using Oracle SQL for around 6 months, so I am still a beginner. I need to query the database to get information on all items on a particular order (the order number comes in via $_GET['id']).
I have come up with the query below. It works as expected and does what I need, but I do not know whether I am overcomplicating things in a way that would slow the query down. I understand there are a number of ways to do a single thing, and there may be better ways to write this query since I am a beginner.
I am using Oracle 8i (because this is the version supplied with an application we use), so I believe that some JOIN syntax etc. is not available in this version, but is there a better way to write a query such as the one below?
SELECT auf_pos.auf_pos,
(SELECT auf_stat.anz
FROM auf_stat
WHERE auf_stat.auf_pos = auf_pos.auf_pos
AND auf_stat.auf_nr = ".$_GET['id']."),
(SELECT auf_text.zl_str
FROM auf_text
WHERE auf_text.zl_mod = 0
AND auf_text.auf_pos = auf_pos.auf_pos
AND auf_text.auf_nr = ".$_GET['id']."),
(SELECT glas_daten_basis.gl_bez
FROM glas_daten_basis
WHERE glas_daten_basis.idnr = auf_pos.glas1),
(SELECT lzr_daten.lzr_breite
FROM lzr_daten
WHERE lzr_daten.lzr_idnr = auf_pos.lzr1),
(SELECT glas_daten_basis.gl_bez
FROM glas_daten_basis
WHERE glas_daten_basis.idnr = auf_pos.glas2),
auf_pos.breite,
auf_pos.hoehe,
auf_pos.spr_jn
FROM auf_pos
WHERE auf_pos.auf_nr = ".$_GET['id']."
Thanks in advance to any Oracle gurus that could help this beginner out!
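You could rewrite it using joins. (Note that the ANSI JOIN syntax shown here needs Oracle 9i or later; on 8i the same joins have to be written with comma-separated tables in the FROM clause and the (+) outer-join operator in the WHERE clause.) If your subselects aren't expected to return any NULL values, then you can use INNER JOINs: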
SELECT auf_pos.auf_pos,
auf_stat.anz,
auf_text.zl_str,
g1.gl_bez,
lzr_daten.lzr_breite,
g2.gl_bez,
auf_pos.breite,
auf_pos.hoehe,
auf_pos.spr_jn
FROM auf_pos
INNER JOIN auf_stat ON auf_stat.auf_pos = auf_pos.auf_pos AND auf_stat.auf_nr = ".$_GET['id']."
INNER JOIN auf_text ON auf_text.zl_mod = 0 AND auf_text.auf_pos = auf_pos.auf_pos AND auf_text.auf_nr = ".$_GET['id']."
-- glas_daten_basis is joined twice, so each join needs its own alias
INNER JOIN glas_daten_basis g1 ON g1.idnr = auf_pos.glas1
INNER JOIN lzr_daten ON lzr_daten.lzr_idnr = auf_pos.lzr1
INNER JOIN glas_daten_basis g2 ON g2.idnr = auf_pos.glas2
-- keep the order filter from the original query
WHERE auf_pos.auf_nr = ".$_GET['id']."
Or if there are cases where you wouldn't have matches on all the tables, you could replace the INNER joins with LEFT OUTER joins:
SELECT auf_pos.auf_pos,
auf_stat.anz,
auf_text.zl_str,
g1.gl_bez,
lzr_daten.lzr_breite,
g2.gl_bez,
auf_pos.breite,
auf_pos.hoehe,
auf_pos.spr_jn
FROM auf_pos
LEFT OUTER JOIN auf_stat ON auf_stat.auf_pos = auf_pos.auf_pos AND auf_stat.auf_nr = ".$_GET['id']."
LEFT OUTER JOIN auf_text ON auf_text.zl_mod = 0 AND auf_text.auf_pos = auf_pos.auf_pos AND auf_text.auf_nr = ".$_GET['id']."
-- glas_daten_basis is joined twice, so each join needs its own alias
LEFT OUTER JOIN glas_daten_basis g1 ON g1.idnr = auf_pos.glas1
LEFT OUTER JOIN lzr_daten ON lzr_daten.lzr_idnr = auf_pos.lzr1
LEFT OUTER JOIN glas_daten_basis g2 ON g2.idnr = auf_pos.glas2
-- keep the order filter from the original query
WHERE auf_pos.auf_nr = ".$_GET['id']."
Whether or not you see any performance gains is debatable. As I understand it, the Oracle query optimizer should take your query and execute it with a plan similar to that of the join queries, but this depends on a number of factors, so the best thing to do is to give it a try.
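The difference between the three forms (scalar subselect, INNER JOIN, LEFT OUTER JOIN) shows up when a lookup has no match. Below is a minimal sketch using Python's built-in sqlite3 module instead of Oracle; the two tables are hypothetical cut-down versions of auf_pos and lzr_daten with invented rows. A scalar subselect yields NULL for a missing match, which is what the LEFT OUTER JOIN does, while the INNER JOIN drops the row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data: position 2 points at lzr1 = 99, which does not exist.
    CREATE TABLE auf_pos (auf_pos INTEGER, lzr1 INTEGER);
    CREATE TABLE lzr_daten (lzr_idnr INTEGER, lzr_breite INTEGER);
    INSERT INTO auf_pos VALUES (1, 50), (2, 99);
    INSERT INTO lzr_daten VALUES (50, 16);
""")

# Scalar subselect, as in the question.
subselect = conn.execute("""
    SELECT p.auf_pos,
           (SELECT l.lzr_breite FROM lzr_daten l WHERE l.lzr_idnr = p.lzr1)
    FROM auf_pos p
    ORDER BY p.auf_pos
""").fetchall()

# LEFT OUTER JOIN: unmatched positions are kept with NULL.
left_join = conn.execute("""
    SELECT p.auf_pos, l.lzr_breite
    FROM auf_pos p
    LEFT OUTER JOIN lzr_daten l ON l.lzr_idnr = p.lzr1
    ORDER BY p.auf_pos
""").fetchall()

# INNER JOIN: unmatched positions are dropped.
inner_join = conn.execute("""
    SELECT p.auf_pos, l.lzr_breite
    FROM auf_pos p
    INNER JOIN lzr_daten l ON l.lzr_idnr = p.lzr1
    ORDER BY p.auf_pos
""").fetchall()

print("subselect: ", subselect)
print("left join: ", left_join)
print("inner join:", inner_join)
```

The forms also differ when a lookup matches more than one row: a join returns one output row per match, while Oracle rejects a scalar subselect that returns more than one row (ORA-01427).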

How to optimize query postgres

I am running the following query:
SELECT fat.*
FROM Table1 fat
LEFT JOIN modo_captura mc ON mc.id = fat.modo_captura_id
INNER JOIN loja lj ON lj.id = fat.loja_id
INNER JOIN rede rd ON rd.id = fat.rede_id
INNER JOIN bandeira bd ON bd.id = fat.bandeira_id
INNER JOIN produto pd ON pd.id = fat.produto_id
INNER JOIN loja_extensao le ON le.id = fat.loja_extensao_id
INNER JOIN conta ct ON ct.id = fat.conta_id
INNER JOIN banco bc ON bc.id = ct.banco_id
LEFT JOIN conciliacao_vendas cv ON fat.empresa_id = cv.empresa_id AND cv.chavefato = fat.chavefato AND fat.rede_id = cv.rede_id
WHERE 1 = 1
AND cv.controle_upload_arquivo_id = 6906
AND fat.parcela = 1
ORDER BY fat.data_venda, fat.data_credito limit 20
But it runs very slowly. Here is the explain plan: http://explain.depesz.com/s/DnXH
Try this rewritten version:
SELECT fat.*
FROM Table1 fat
JOIN conciliacao_vendas cv USING (empresa_id, chavefato, rede_id)
JOIN loja lj ON lj.id = fat.loja_id
JOIN rede rd ON rd.id = fat.rede_id
JOIN bandeira bd ON bd.id = fat.bandeira_id
JOIN produto pd ON pd.id = fat.produto_id
JOIN loja_extensao le ON le.id = fat.loja_extensao_id
JOIN conta ct ON ct.id = fat.conta_id
JOIN banco bc ON bc.id = ct.banco_id
LEFT JOIN modo_captura mc ON mc.id = fat.modo_captura_id
WHERE cv.controle_upload_arquivo_id = 6906
AND fat.parcela = 1
ORDER BY fat.data_venda, fat.data_credito
LIMIT 20;
JOIN syntax and sequence of joins
In particular, I fixed the misleading LEFT JOIN to conciliacao_vendas, which is forced to act as a plain [INNER] JOIN by the later WHERE condition anyway. This should simplify query planning and allow rows to be eliminated earlier in the process, which should make everything a lot cheaper. Related answer with detailed explanation:
Explain JOIN vs. LEFT JOIN and WHERE condition performance suggestion in more detail
USING is just a syntactical shorthand.
Since there are many tables involved in the query and the order the rewritten query joins tables is optimal now, you can fine-tune this with SET LOCAL join_collapse_limit = 1 to save planning overhead and avoid inferior query plans. Run in a single transaction:
BEGIN;
SET LOCAL join_collapse_limit = 1;
SELECT ...; -- read data here
COMMIT; -- or ROLLBACK;
More about that:
Sample Query to show Cardinality estimation error in PostgreSQL
The fine manual on Controlling the Planner with Explicit JOIN Clauses
Index
Add some indexes on lookup tables with lots of rows (not necessary for just a couple of dozen), in particular (taken from your query plan):
Seq Scan on public.conta ct ... rows=6771
Seq Scan on public.loja lj ... rows=1568
Seq Scan on public.loja_extensao le ... rows=16394
That's particularly odd, because those columns look like primary key columns and should already have an index ...
So:
CREATE INDEX conta_pkey_idx ON public.conta (id);
CREATE INDEX loja_pkey_idx ON public.loja (id);
CREATE INDEX loja_extensao_pkey_idx ON public.loja_extensao (id);
To make this really fast, a multicolumn index would be of great service:
CREATE INDEX foo ON Table1 (parcela, data_venda, data_credito);
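The effect of such a multicolumn index can be tried out on a small scale. The sketch below uses Python's built-in sqlite3 module, so it shows SQLite's planner and not PostgreSQL's; the cut-down Table1 definition and the generated rows are hypothetical. The point it illustrates is that an index with the equality column first and the ORDER BY columns next lets the engine both find the rows and return them already sorted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical cut-down version of Table1; only the columns used here.
    CREATE TABLE Table1 (
        id INTEGER PRIMARY KEY,
        parcela INTEGER,
        data_venda TEXT,
        data_credito TEXT,
        valor REAL
    );
""")
conn.executemany(
    "INSERT INTO Table1 (parcela, data_venda, data_credito, valor) VALUES (?, ?, ?, ?)",
    [(i % 3 + 1, f"2014-01-{i % 28 + 1:02d}", f"2014-02-{i % 28 + 1:02d}", i * 1.5)
     for i in range(1000)],
)

QUERY = """
    SELECT * FROM Table1
    WHERE parcela = 1
    ORDER BY data_venda, data_credito
    LIMIT 20
"""

def plan(connection, sql):
    """Return SQLite's plan as one string (the last column holds the description)."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

before = plan(conn, QUERY)
conn.execute("CREATE INDEX foo ON Table1 (parcela, data_venda, data_credito)")
after = plan(conn, QUERY)

print("before:", before)
print("after: ", after)
```

On PostgreSQL the equivalent check is to compare EXPLAIN ANALYZE output before and after creating the index.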