How to optimize query postgres - sql

I am running the following query:
SELECT fat.*
FROM Table1 fat
LEFT JOIN modo_captura mc ON mc.id = fat.modo_captura_id
INNER JOIN loja lj ON lj.id = fat.loja_id
INNER JOIN rede rd ON rd.id = fat.rede_id
INNER JOIN bandeira bd ON bd.id = fat.bandeira_id
INNER JOIN produto pd ON pd.id = fat.produto_id
INNER JOIN loja_extensao le ON le.id = fat.loja_extensao_id
INNER JOIN conta ct ON ct.id = fat.conta_id
INNER JOIN banco bc ON bc.id = ct.banco_id
LEFT JOIN conciliacao_vendas cv ON fat.empresa_id = cv.empresa_id AND cv.chavefato = fat.chavefato AND fat.rede_id = cv.rede_id
WHERE 1 = 1
AND cv.controle_upload_arquivo_id = 6906
AND fat.parcela = 1
ORDER BY fat.data_venda, fat.data_credito limit 20
But it runs very slowly. Here is the EXPLAIN plan: http://explain.depesz.com/s/DnXH

Try this rewritten version:
SELECT fat.*
FROM Table1 fat
JOIN conciliacao_vendas cv USING (empresa_id, chavefato, rede_id)
JOIN loja lj ON lj.id = fat.loja_id
JOIN rede rd ON rd.id = fat.rede_id
JOIN bandeira bd ON bd.id = fat.bandeira_id
JOIN produto pd ON pd.id = fat.produto_id
JOIN loja_extensao le ON le.id = fat.loja_extensao_id
JOIN conta ct ON ct.id = fat.conta_id
JOIN banco bc ON bc.id = ct.banco_id
LEFT JOIN modo_captura mc ON mc.id = fat.modo_captura_id
WHERE cv.controle_upload_arquivo_id = 6906
AND fat.parcela = 1
ORDER BY fat.data_venda, fat.data_credito
LIMIT 20;
JOIN syntax and sequence of joins
In particular, I fixed the misleading LEFT JOIN to conciliacao_vendas, which is forced to act as a plain [INNER] JOIN by the later WHERE condition anyway. This should simplify query planning and allow rows to be eliminated earlier in the process, which should make everything a lot cheaper. Related answer with detailed explanation:
Explain JOIN vs. LEFT JOIN and WHERE condition performance suggestion in more detail
USING is just a syntactical shorthand.
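The LEFT JOIN point can be checked on toy data. This is only a minimal sketch: it uses Python's built-in sqlite3 module instead of Postgres, two made-up tables (fat, cv) with a reduced column set and invented rows, and it assumes that standard LEFT JOIN / WHERE semantics are the same in both systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, cut-down versions of the tables in the question.
    CREATE TABLE fat (id INTEGER PRIMARY KEY, empresa_id INT, chavefato TEXT, rede_id INT);
    CREATE TABLE cv  (empresa_id INT, chavefato TEXT, rede_id INT,
                      controle_upload_arquivo_id INT);
    INSERT INTO fat VALUES (1, 10, 'a', 1), (2, 10, 'b', 1), (3, 10, 'c', 1);
    INSERT INTO cv  VALUES (10, 'a', 1, 6906), (10, 'b', 1, 42);
""")

# Original shape: LEFT JOIN, then a WHERE condition on the right-hand table.
left_join_sql = """
    SELECT fat.id FROM fat
    LEFT JOIN cv ON fat.empresa_id = cv.empresa_id
                AND cv.chavefato = fat.chavefato
                AND fat.rede_id = cv.rede_id
    WHERE cv.controle_upload_arquivo_id = 6906
    ORDER BY fat.id
"""

# Rewritten shape: plain JOIN with USING.
inner_join_sql = """
    SELECT fat.id FROM fat
    JOIN cv USING (empresa_id, chavefato, rede_id)
    WHERE cv.controle_upload_arquivo_id = 6906
    ORDER BY fat.id
"""

left_rows = conn.execute(left_join_sql).fetchall()
inner_rows = conn.execute(inner_join_sql).fetchall()
print(left_rows, inner_rows)
```

Row 3 has no match in cv, so the LEFT JOIN pads it with NULLs, and the WHERE condition then throws it away again. Both queries end up returning the same rows.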
Since there are many tables involved in the query and the order in which the rewritten query joins tables is optimal now, you can fine-tune this with SET LOCAL join_collapse_limit = 1, which makes the planner keep the join order as written, to save planning overhead and avoid inferior query plans. Run in a single transaction:
BEGIN;
SET LOCAL join_collapse_limit = 1;
SELECT ...; -- read data here
COMMIT; -- or ROLLBACK;
More about that:
Sample Query to show Cardinality estimation error in PostgreSQL
The fine manual on Controlling the Planner with Explicit JOIN Clauses
Index
Add some indexes on lookup tables with lots of rows (not necessary for just a couple of dozen), in particular (taken from your query plan):
Seq Scan on public.conta ct ... rows=6771
Seq Scan on public.loja lj ... rows=1568
Seq Scan on public.loja_extensao le ... rows=16394
That's particularly odd, because those columns look like primary key columns and should already have an index ...
So:
CREATE INDEX conta_pkey_idx ON public.conta (id);
CREATE INDEX loja_pkey_idx ON public.loja (id);
CREATE INDEX loja_extensao_pkey_idx ON public.loja_extensao (id);
To make this really fast, a multicolumn index would be of great service:
CREATE INDEX foo ON Table1 (parcela, data_venda, data_credito);
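Why this column order helps can be sketched with a toy table: the equality column (parcela) comes first, followed by the ORDER BY columns, so an index scan can hand back matching rows already sorted and stop after 20. The example below is an illustration under assumptions, not a Postgres plan: it uses Python's built-in sqlite3 module, a made-up column set for table1, and SQLite's EXPLAIN QUERY PLAN, whose planner differs from the Postgres one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, cut-down stand-in for Table1.
conn.execute("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, parcela INT,
                         data_venda TEXT, data_credito TEXT, valor REAL)
""")

query = """
    SELECT * FROM table1
    WHERE parcela = 1
    ORDER BY data_venda, data_credito
    LIMIT 20
"""

def plan(sql):
    # The fourth column of each EXPLAIN QUERY PLAN row is the readable step.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

plan_before = plan(query)
conn.execute("CREATE INDEX foo ON table1 (parcela, data_venda, data_credito)")
plan_after = plan(query)

print("before:", plan_before)
print("after: ", plan_after)
```

Without the index, SQLite reports a separate sort step for the ORDER BY. With the index in place, that step disappears and the plan searches the index instead. In Postgres you would verify the same thing with EXPLAIN ANALYZE on the real table.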

Related

Is there any way to optimize this select

I have this big SELECT, which uses UNION. I'm wondering whether there is any way to optimize it, because it is quite heavy right now.
As you can see, the main difference between the two branches is the joined table (srv_obj_intermediate vs. srv_obj_attributes), and they use 2 different oet.code values.
SELECT * FROM (SELECT soi.value, srv.osp_id, soi.stya_id, eax.estpt_id, eax.discount, seo.id AS sero_id FROM estimate_attr_xref eax
JOIN attribute_types attl ON attl.id = eax.attr_id
JOIN object_attr_type_links oatl ON oatl.attr_id = attl.id
JOIN service_type_attributes sta ON sta.objt_attr_id = oatl.id
JOIN srv_obj_intermediate soi ON soi.stya_id = sta.id
AND ((soi.value = 0) OR (soi.value = 1 AND festpae_id IS NOT NULL))
JOIN service_objects seo ON seo.id = soi.sero_id
JOIN services srv ON srv.id = seo.srv_id
JOIN order_event oet ON oet.code = 'INTERMEDIATE'
WHERE eax.rate = 1 AND eax.ordet_id = oet.id
AND eax.objt_attr_id = sta.objt_attr_id) WHERE value = 1
UNION
SELECT soa.value, srv.osp_id, soa.stya_id, eax.estpt_id, eax.discount, seo.id AS sero_id FROM estimate_attr_xref eax
JOIN attribute_types attl ON attl.id = eax.attr_id
JOIN object_attr_type_links oatl ON oatl.attr_id = attl.id
JOIN service_type_attributes sta ON sta.objt_attr_id = oatl.id
JOIN srv_obj_attributes soa ON soa.stya_id = sta.id
AND soa.value = 1
LEFT JOIN srv_obj_intermediate soi ON soi.stya_id = sta.id
AND soi.value = 1
JOIN service_objects seo ON seo.id = soa.sero_id
JOIN services srv ON srv.id = seo.srv_id
JOIN order_event oet ON oet.code = 'INITIAL'
WHERE eax.rate = 1 AND eax.ordet_id = oet.id
AND eax.objt_attr_id = sta.objt_attr_id AND soi.value IS NULL
At a quick glance, and without having any insight into your data, relationships, volumes, indexes, partitioning, CPU etc.:
(1) In the first outer select (i.e. before UNION), you seem to have a filter WHERE value = 1 where value is actually soi.value. In the inner select you have a condition ((soi.value = 0) OR (soi.value = 1 AND festpae_id IS NOT NULL)). Wouldn't it suffice to just use soi.value = 1 AND festpae_id IS NOT NULL in the inner select and avoid an outer select? What soi.value are you looking for?
(2) Similarly, in the second select you have LEFT JOIN srv_obj_intermediate soi ON soi.stya_id = sta.id and further down you have a filter AND soi.value IS NULL. Again, what soi.value are you looking for?
(3) Consider moving the oet.code filter predicate into the WHERE clause and using JOIN order_event oet ON eax.ordet_id = oet.id for reasons mentioned here, although this doesn't guarantee a performance improvement. You'll need to review if and how the execution plan changes in each case.
(4) Are stats up-to-date on all these tables?
(5) Have you reviewed the plan? Are you missing any joins and/or having cartesian joins in the plan? Are you seeing full table scans when you expect usage of an index or expect partition pruning? This white paper is a good starting point if you're unfamiliar with explain plans.
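Point (1) can be checked mechanically. The sketch below assumes that festpae_id is a column of srv_obj_intermediate (the question does not qualify it) and runs both forms of the filter against every combination of value and festpae_id, including NULLs, using Python's built-in sqlite3 module and a made-up one-table schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for srv_obj_intermediate.
    CREATE TABLE soi (id INTEGER PRIMARY KEY, value INT, festpae_id INT);
    INSERT INTO soi (value, festpae_id) VALUES
        (0, NULL), (0, 7), (1, NULL), (1, 7),
        (2, NULL), (2, 7), (NULL, NULL), (NULL, 7);
""")

# Shape used in the question: inner OR condition, outer filter on value = 1.
nested = """
    SELECT id FROM (
        SELECT id, value FROM soi
        WHERE (value = 0) OR (value = 1 AND festpae_id IS NOT NULL)
    ) WHERE value = 1
    ORDER BY id
"""

# Simplified shape suggested in point (1): no outer select needed.
flat = """
    SELECT id FROM soi
    WHERE value = 1 AND festpae_id IS NOT NULL
    ORDER BY id
"""

nested_ids = [row[0] for row in conn.execute(nested)]
flat_ids = [row[0] for row in conn.execute(flat)]
print(nested_ids, flat_ids)
```

Both queries keep only the row with value = 1 and a non-NULL festpae_id, so the value = 0 branch never contributes to the first half of the UNION.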

SQL JOIN: add custom constraint in JOIN clause

I want to select the preferred language if it exists and the default language otherwise.
SELECT a.code,
case
when vpi.program_items is not null then vpi.program_items else vpi2.program_items
end
FROM activity a
LEFT OUTER JOIN v_program_items vpi ON vpi.activity_id = a.id AND vpi.language = 'fr_BE'
LEFT OUTER JOIN v_program_items vpi2 ON vpi2.activity_id = a.id AND vpi2.language = 'fr'
WHERE a.id = 62170
The v_program_items table looks like this:
- ID | language | program_items
- 62170 | fr | Présentation du club et des machines¤Briefing avant le vol¤45 minutes de vol en ULM
- 62170 | fr_BE | Un vol en ULM (45 min)
I use two JOIN (on the same table) and one CASE/WHEN.
Is it possible to use only one JOIN ?
The joins you have are fine and perform very well with an index - should be a UNIQUE index (or PK):
CREATE UNIQUE INDEX ON v_program_items (activity_id, language);
Use COALESCE in the SELECT list, like "PM 77-1" suggested in a comment:
SELECT a.code, COALESCE(v1.program_items, v2.program_items) AS program_items
FROM activity a
LEFT JOIN v_program_items v1 ON v1.activity_id = a.id AND v1.language = 'fr_BE'
LEFT JOIN v_program_items v2 ON v2.activity_id = a.id AND v2.language = 'fr'
WHERE a.id = 62170;
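To see the fallback behaviour on all three cases (preferred language present, only the default present, neither present), here is a small sketch. It uses Python's built-in sqlite3 module rather than Postgres, and the activity ids, codes and texts are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE activity (id INTEGER PRIMARY KEY, code TEXT);
    CREATE TABLE v_program_items (activity_id INT, language TEXT, program_items TEXT);
    CREATE UNIQUE INDEX v_program_items_uq ON v_program_items (activity_id, language);

    -- Invented sample data: activity 1 has both languages, 2 only 'fr', 3 none.
    INSERT INTO activity VALUES (1, 'A1'), (2, 'A2'), (3, 'A3');
    INSERT INTO v_program_items VALUES
        (1, 'fr', 'default text'),
        (1, 'fr_BE', 'belgian text'),
        (2, 'fr', 'only default');
""")

sql = """
    SELECT a.code, COALESCE(v1.program_items, v2.program_items) AS program_items
    FROM activity a
    LEFT JOIN v_program_items v1 ON v1.activity_id = a.id AND v1.language = 'fr_BE'
    LEFT JOIN v_program_items v2 ON v2.activity_id = a.id AND v2.language = 'fr'
    ORDER BY a.id
"""

result = conn.execute(sql).fetchall()
for row in result:
    print(row)
```

Because the language conditions sit in the ON clauses, an activity without any translation still shows up, with NULL as its program_items.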
In Postgres 11 or later, and only if your table v_program_items is big, consider a covering index:
CREATE UNIQUE INDEX ON v_program_items (activity_id, language) INCLUDE (program_items);
Related:
Do covering indexes in PostgreSQL help JOIN columns?
Either way, while selecting only a single row (or a few), plain correlated subqueries should be even faster. Simple, too:
SELECT a.code
, COALESCE((SELECT program_items FROM v_program_items WHERE activity_id = a.id AND language = 'fr_BE')
, (SELECT program_items FROM v_program_items WHERE activity_id = a.id AND language = 'fr')) AS program_items
FROM activity a
WHERE a.id = 62170
Your method is fine. If you wanted just one join, you could do it as:
SELECT a.code, vpi.program_items
FROM activity a LEFT JOIN
v_program_items vpi
ON vpi.activity_id = a.id AND vpi.language IN ('fr_BE', 'fr')
WHERE a.id = 62170
ORDER BY (vpi.language = 'fr_BE') DESC
FETCH FIRST 1 ROW ONLY;
It is not clear that this would have better performance, though.

How to improve SQL inner join performance?

How can I improve the performance of this query? The second join, the INNER JOIN to CustomerAccountBrand, is taking a long time. I have added a nonclustered index, but it is not used. Could I split this into two inner joins and combine the results afterwards? Any help getting this data would be appreciated.
SELECT DISTINCT
RA.AccountNumber,
RA.ShipTo,
RA.SystemCode,
CAB.BrandCode
FROM dbo.CustomerAccountRelatedAccounts RA -- Views
INNER JOIN dbo.CustomerAccount CA
ON RA.RelatedAccountNumber = CA.AccountNumber
AND RA.RelatedShipTo = CA.ShipTo
AND RA.RelatedSystemCode = CA.SystemCode
INNER JOIN dbo.CustomerAccountBrand CAB ---- Taking long time 4:30 mins
ON CA.AccountNumber = CAB.AccountNumber
AND CA.ShipTo = CAB.ShipTo
AND CA.SystemCode = CAB.SystemCode
ALTER VIEW [dbo].[CustomerAccountRelatedAccounts]
AS
SELECT
ca.AccountNumber, ca.ShipTo, ca.SystemCode, cafg.AccountNumber AS RelatedAccountNumber, cafg.ShipTo AS RelatedShipTo,
cafg.SystemCode AS RelatedSystemCode
FROM dbo.CustomerAccount AS ca
LEFT OUTER JOIN dbo.CustomerAccount AS cafg
ON ca.FinancialGroup = cafg.FinancialGroup
AND ca.NationalAccount = cafg.NationalAccount
AND cafg.IsActive = 1
WHERE CA.IsActive = 1
In my experience, the SQL Server query optimizer often fails to pick the correct join algorithm when queries become more complex (e.g. joining with your view means that there's no index readily available to join on). If that's what's happening here, then the easy fix is to add a join hint to turn it into a hash join:
SELECT DISTINCT
RA.AccountNumber,
RA.ShipTo,
RA.SystemCode,
CAB.BrandCode
FROM dbo.CustomerAccountRelatedAccounts RA -- Views
INNER JOIN dbo.CustomerAccount CA
ON RA.RelatedAccountNumber = CA.AccountNumber
AND RA.RelatedShipTo = CA.ShipTo
AND RA.RelatedSystemCode = CA.SystemCode
INNER HASH JOIN dbo.CustomerAccountBrand CAB ---- Note the "HASH" keyword
ON CA.AccountNumber = CAB.AccountNumber
AND CA.ShipTo = CAB.ShipTo
AND CA.SystemCode = CAB.SystemCode

How to improve the performance of a SQL query even after adding indexes?

I am trying to execute the following SQL query, but it takes 22 seconds to execute. The number of returned items is 554192. I need to make this faster and have already put indexes on all the tables involved.
SELECT mc.name AS MediaName,
lcc.name AS Country,
i.overridedate AS Date,
oi.rating,
bl1.firstname + ' ' + bl1.surname AS Byline,
b.id BatchNo,
i.numinbatch ItemNumberInBatch,
bah.changedatutc AS BatchDate,
pri.code AS IssueNo,
pri.name AS Issue,
lm.neptunemessageid AS MessageNo,
lmt.name AS MessageType,
bl2.firstname + ' ' + bl2.surname AS SourceFullName,
lst.name AS SourceTypeDesc
FROM profiles P
INNER JOIN profileresults PR
ON P.id = PR.profileid
INNER JOIN items i
ON PR.itemid = I.id
INNER JOIN batches b
ON b.id = i.batchid
INNER JOIN itemorganisations oi
ON i.id = oi.itemid
INNER JOIN lookup_mediachannels mc
ON i.mediachannelid = mc.id
LEFT OUTER JOIN lookup_cities lc
ON lc.id = mc.cityid
LEFT OUTER JOIN lookup_countries lcc
ON lcc.id = mc.countryid
LEFT OUTER JOIN itembylines ib
ON ib.itemid = i.id
LEFT OUTER JOIN bylines bl1
ON bl1.id = ib.bylineid
LEFT OUTER JOIN batchactionhistory bah
ON b.id = bah.batchid
INNER JOIN itemorganisationissues ioi
ON ioi.itemorganisationid = oi.id
INNER JOIN projectissues pri
ON pri.id = ioi.issueid
LEFT OUTER JOIN itemorganisationmessages iom
ON iom.itemorganisationid = oi.id
LEFT OUTER JOIN lookup_messages lm
ON iom.messageid = lm.id
LEFT OUTER JOIN lookup_messagetypes lmt
ON lmt.id = lm.messagetypeid
LEFT OUTER JOIN itemorganisationsources ios
ON ios.itemorganisationid = oi.id
LEFT OUTER JOIN bylines bl2
ON bl2.id = ios.bylineid
LEFT OUTER JOIN lookup_sourcetypes lst
ON lst.id = ios.sourcetypeid
WHERE p.id = #profileID
AND b.statusid IN ( 6, 7 )
AND bah.batchactionid = 6
AND i.statusid = 2
AND i.isrelevant = 1
When looking at the execution plan, I can see a step which is costing 42%. Is there any way I could get this to a lower threshold, or any way that I can improve the performance of the whole query?
Remove the profiles table as it is not needed and change the WHERE clause to
WHERE PR.profileid = #profileID
You have a left outer join on the batchactionhistory table but also have a condition in your WHERE clause which turns it back into an inner join. Change your code to this:
LEFT OUTER JOIN batchactionhistory bah
ON b.id = bah.batchid
AND bah.batchactionid = 6
You may not need the batches table: it is used to join other tables which could be joined directly, and to show the id in your SELECT, which is also available in items. This only holds if the b.statusid filter in your WHERE clause can be dropped or applied elsewhere. Make the following changes:
i.batchid AS BatchNo,
LEFT OUTER JOIN batchactionhistory bah
ON i.batchid = bah.batchid
Are any of the fields that are used in joins or the WHERE clause from tables that contain large amounts of data but are not indexed? If so, try adding an index, one at a time, starting with the largest table.
Do you need every field in the result? If you could lose one or two, you could maybe reduce the number of tables further.
First, if this is not a stored procedure, make it one. That's a lot of text for SQL Server to compile.
Next, my experience is that "worst practices" are occasionally a good idea. Specifically, I have been able to improve performance by splitting large queries into a couple or three small ones and assembling the results.
If this query is associated with a .NET, ColdFusion, Java, etc. application, you might be able to do the split/re-assemble in your application code. If not, a temporary table might come in handy.
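As a rough idea of what split/re-assemble in application code can look like, here is a minimal sketch. It is not the query from the question: it uses Python's built-in sqlite3 module, two made-up tables (items, itembylines) with invented rows, and it only shows the pattern of fetching the core rows and an optional detail set separately, then stitching them together by key.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, heavily reduced schema.
    CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT, statusid INT);
    CREATE TABLE itembylines (itemid INT, byline TEXT);
    INSERT INTO items VALUES (1, 'first', 2), (2, 'second', 2), (3, 'skipped', 1);
    INSERT INTO itembylines VALUES (1, 'Ann Lee'), (1, 'Bo Chen'), (3, 'Cy Diaz');
""")

# Query 1: the core rows, with the selective filter applied.
items = conn.execute(
    "SELECT id, title FROM items WHERE statusid = 2 ORDER BY id"
).fetchall()

# Query 2: the optional detail rows for those items, fetched on their own.
bylines = defaultdict(list)
detail_sql = """
    SELECT itemid, byline FROM itembylines
    WHERE itemid IN (SELECT id FROM items WHERE statusid = 2)
    ORDER BY byline
"""
for itemid, byline in conn.execute(detail_sql):
    bylines[itemid].append(byline)

# Re-assemble in the application: one entry per item, details attached by key.
assembled = [(item_id, title, bylines.get(item_id, [])) for item_id, title in items]
for entry in assembled:
    print(entry)
```

One side effect of this shape is that one-to-many details no longer multiply the core rows the way a chain of outer joins does. Whether it is actually faster for the real query has to be measured.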

optimizer refuses to use the index

SELECT /*+ PARALLEL(cfe, 6) */
dpd.f_p_descr,
ef.t_a_code,
pd.p_name,
ef.t_q
FROM e_fact ef
INNER JOIN d_dim dd
ON ef.t_d_key = dd.d_key
INNER JOIN f_e cfe
ON ef.ref_id = cfe.t_id
AND ef.r_version = cfe.t_version
INNER JOIN d_dim dpd
ON dpd.d_key = ef.d_key
INNER JOIN p_dim pd
ON pd.p_key = ef.b_p_key
INNER JOIN r_dim rd
ON rd.r_key = ef.t_r_key
INNER JOIN f_t_dim ftd
ON ftd.t_key = cfe.t_key
WHERE dd.d_value = '19-OCT-2012'
AND dpd.f_d = 'XYZ'
AND ftd.s_id IN (201, 209)
AND rd.r_n = 'ABC'
I got this query from production. The problem is that the optimizer refuses to use the index on f_e, even when the hint /*+ index(e.c_fact_idx12) */ is added. What should my approach be, and what do I need to check? Is there any other way to tune this query? I am new to query tuning, so help would be appreciated.
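You are using e.c_fact_idx12, but the table alias e is not defined anywhere in the query! The hint has to name an alias that actually appears in the query (cfe for f_e here). Assuming this is Oracle, the INDEX hint also expects the alias and the index name separated by a space rather than a dot, as in INDEX(cfe c_fact_idx12).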