SQL joins vs nested query - sql

This two SQL statements return equal results but first one is much slower than the second:
SELECT leading.email, kstatus.name, contacts.status
FROM clients
JOIN clients_leading ON clients.id_client = clients_leading.id_client
JOIN leading ON clients_leading.id_leading = leading.id_leading
JOIN contacts ON contacts.id_k_p = clients_leading.id_group
JOIN kstatus on contacts.status = kstatus.id_kstatus
WHERE (clients.email = 'some_email' OR clients.email1 = 'some_email')
ORDER BY contacts.date DESC;
SELECT leading.email, kstatus.name, contacts.status
FROM (
SELECT *
FROM clients
WHERE (clients.email = 'some_email' OR clients.email1 = 'some_email')
)
AS clients
JOIN clients_leading ON clients.id_client = clients_leading.id_client
JOIN leading ON clients_leading.id_leading = leading.id_leading
JOIN contacts ON contacts.id_k_p = clients_leading.id_group
JOIN kstatus on contacts.status = kstatus.id_kstatus
ORDER BY contacts.date DESC;
But I'm wondering why is it so? It looks like in the firt statement joins are done first and then WHERE clause is applied and in second is just the opposite. But will it behave the same way on all DB engines (I tested it on MySQL)?
I was expecting DB engine can optimize queries like the fors one and firs apply WHERE clause and then make joins.

There are a lot of different reasons this could be (keying etc), but you can look at the explain mysql command to see how the statements are being executed. If you can run that and if it still is a mystery post it.

You always can replace join with nested query... It's always faster but lot messy...

Related

Oracle complex query with multiple joins on same table

I am dealing with a monster query ( ~800 lines ) on oracle 11, and its taking expensive resources.
The main problem here is a table mouvement with about ~18 million lines, on which I have like 30 left joins on this table.
LEFT JOIN mouvement mracct_ad1
ON mracct_ad1.code_portefeuille = t.code_portefeuille
AND mracct_ad1.statut_ligne = 'PROPRE'
AND substr(mracct_ad1.code_valeur,1,4) = 'MRAC'
AND mracct_ad1.code_transaction = t.code_transaction
LEFT JOIN mouvement mracct_zias
ON mracct_zias.code_portefeuille = t.code_portefeuille
AND mracct_zias.statut_ligne = 'PROPRE'
AND substr(mracct_zias.code_valeur,1,4) = 'PRAC'
AND mracct_zias.code_transaction = t.code_transaction
LEFT JOIN mouvement mracct_zixs
ON mracct_zias.code_portefeuille = t.code_portefeuille
AND mracct_zias.statut_ligne = 'XROPRE'
AND substr(mracct_zias.code_valeur,1,4) = 'MRAT'
AND mracct_zias.code_transaction = t.code_transaction
is there some way so I can get rid of the left joins, (union join or example) to make the query faster and consumes less? execution plan or something?
Just a note on performance. Usually you want to "rephrase" conditions like:
AND substr(mracct_ad1.code_valeur,1,4) = 'MRAC'
In simple words, expressions on the left side of the equality will prevent the best usage of indexes and may push the SQL optimizer toward a less than optimal plan. The database engine will end up doing more work than is really needed, and the query will be [much] slower. In extreme cases they can even decide to use a Full Table Scan. In this case you can rephrase it as:
AND mracct_ad1.code_valeur like 'MRAC%'
or:
AND mracct_ad1.code_valeur >= 'MRAC' AND mracct_ad1.code_valeur < 'MRAD'
I am guessing so. Your code sample doesn't make much sense, but you can probably do conditional aggregation:
left join
(select m.code_portefeuille, m.code_transaction,
max(case when m.statut_ligne = 'PROPRE' and m.code_valeur like 'MRAC%' then ? end) as ad1,
max(case when m.statut_ligne = 'PROPRE' and m.code_valeur like 'MRAC%' then ? end) as zia,
. . . -- for all the rest of the joins as well
from mouvement m
group by m.code_portefeuille, m.code_transaction
) m
on m.code_portefeuille = t.code_portefeuille and m.code_transaction = t.code_transaction
You can probably replace all 30 joins with a single join to the aggregated table.

Trying to write SQL query where I need to join 5 tables

I am fairly new to SQL. I am working on a query script where I need to join date from five tables. I have no way to test this, so I am not sure if it's right.
SELECT tblQCC.CSN,
tblQCC.question_id,
tblQCC.answer,
tblEncounters.CSN,
tblEncounters.department_id,
tblEncounters.prc_id,
tblEncounters.patient_id,
tblEncounters.account_id,
tblEncounters.ser_id,
tblEncounters.visit_date,
tblPatient.patient_id,
tblPatient.patient_name_last,
tblPatient.patient_name_first,
tblPatient.MRN,
tblPatient.DOB,
tblAccount.account_id,
tblAccount.benefit_plan_name,
tblSer.ser_id,
tblSer.provider_name
FROM theQCC
JOIN tblEncounters
ON tblQCC.CSN = tblEncouter.CSN
JOIN tblPatient
ON tblEncounters.patient_id = tblPatient.patient_id
JOIN tblAccount
ON tblEncounters.account_id = tblAccount.account_id
JOIN tblSer
ON tblEncounters.ser_id = tblSer.ser_id
WHERE tblEncounters.depatement_id = 500
AND tblQCC.answer = ‘yes’
AND tblEncounters.visit_date <= 2016-12-10;
First off, what database?
Specify your JOINs. Are they INNER, OUTER, LEFT, RIGHT?
Do you have tables with acceptable null values coming in?
Quotes around the date part ... etc.

The "where" condition worked not as expected ("or" issue)

I have a problem to join thoses 4 tables
Model of my database
I want to count the number of reservations with different sorts (user [mrbs_users.id], room [mrbs_room.room_id], area [mrbs_area.area_id]).
Howewer when I execute this query (for the user (id=1) )
SELECT count(*)
FROM mrbs_users JOIN mrbs_entry ON mrbs_users.name=mrbs_entry.create_by
JOIN mrbs_room ON mrbs_entry.room_id = mrbs_room.id
JOIN mrbs_area ON mrbs_room.area_id = mrbs_area.id
WHERE mrbs_entry.start_time BETWEEN "145811700" and "1463985000"
or
mrbs_entry.end_time BETWEEN "1458120600" and "1463992200" and mrbs_users.id = 1
The result is the total number of reservations of every user, not just the user who has the id = 1.
So if anyone could help me.. Thanks in advance.
Use parentheses in the where clause whenever you have more than one condition. Your where is parsed as:
WHERE (mrbs_entry.start_time BETWEEN "145811700" and "1463985000" ) or
(mrbs_entry.end_time BETWEEN "1458120600" and "1463992200" and
mrbs_users.id = 1
)
Presumably, you intend:
WHERE (mrbs_entry.start_time BETWEEN 145811700 and 1463985000 or
mrbs_entry.end_time BETWEEN 1458120600 and 1463992200
) and
mrbs_users.id = 1
Also, I removed the quotes around the string constants. It is bad practice to mix data types, and in some databases, the conversion between types can make the query less efficient.
The problem you've faced caused by the incorrect condition WHERE.
So, should be:
WHERE (mrbs_entry.start_time BETWEEN 145811700 AND 1463985000 )
OR
(mrbs_entry.end_time BETWEEN 1458120600 AND 1463992200 AND mrbs_users.id = 1)
Moreover, when you use only INNER JOIN (JOIN) then it be better to avoid WHERE clause, because the ON clause is executed before the WHERE clause, so criteria there would perform faster.
Your query in this case should be like this:
SELECT COUNT(*)
FROM mrbs_users
JOIN mrbs_entry ON mrbs_users.name=mrbs_entry.create_by
JOIN mrbs_room ON mrbs_entry.room_id = mrbs_room.id
AND
(mrbs_entry.start_time BETWEEN 145811700 AND 1463985000
OR ( mrbs_entry.end_time BETWEEN 1458120600 AND 1463992200 AND mrbs_users.id = 1)
)
JOIN mrbs_area ON mrbs_room.area_id = mrbs_area.id

Oracle Sub-select taking a long time

I have a SQL Query that comprise of two level sub-select. This is taking too much time.
The Query goes like:
select * from DALDBO.V_COUNTRY_DERIV_SUMMARY_XREF
where calculation_context_key = 130205268077
and DERIV_POSITION_KEY in
(select ctry_risk_derivs_psn_key
from DALDBO.V_COUNTRY_DERIV_PSN
where calculation_context_key = 130111216755
--and ctry_risk_derivs_psn_key = 76296412
and CREDIT_PRODUCT_TYPE = 'SWP OP'
and CALC_OBLIGOR_COUNTRY_OF_ASSETS in
(select ctry_cd
from DALDBO.V_PSN_COUNTRY
where calculation_context_key = 130134216755
--and ctry_risk_derivs_psn_key = 76296412
)
)
These tables are huge! Is there any optimizations available?
Without knowing anything about your table or view definitions, indexing, etc. I would start by looking at the sub-selects and ensuring that they are performing optimally. I would also want to know how many values are being returned by each sub-select as this can impact performance.
How is calculation_context_key used to retrieve rows from V_COUNTRY_DERIV_PSN and V_PSN_COUNTRY? Is it an optimal execution plan?
How is DERIV_POSITION_KEY and CALC_OBLIGOR_COUNTRY_OF_ASSETS used in V_COUNTRY_DERIV_SUMMARY_XREF to retrieve rows? Again, look at the explain plan.
first of all, can you write this query using inner joins (and not subselect) ??
select A.*
from DALDBO.V_COUNTRY_DERIV_SUMMARY_XREF a,
DALDBO.V_COUNTRY_DERIV_PSN b,
DALDBO.V_PSN_COUNTRY c
where calculation_context_key = 130205268077
and a.DERIV_POSITION_KEY = b.ctry_risk_derivs_psn_key
and b.calculation_context_key = 130111216755
--and b.ctry_risk_derivs_psn_key = 76296412
and b.CREDIT_PRODUCT_TYPE = 'SWP OP'
and b.CALC_OBLIGOR_COUNTRY_OF_ASSETS = c.ctry_cd
and c.calculation_context_key = 130134216755
--and c.ctry_risk_derivs_psn_key = 76296412
second, best practice says that when you don't query any data from the tables in the subselect you better of using an EXISTS instead of IN. new versions of oracle does that automatically and actually rewrite the whole thing as an inner join.
last, without any knowledge on you data and of what you are trying to do i would suggest you to try and use views as less as you can - if you can query the underling tables it would be best and you will probably see immediate performance improvement.

Bad performance of SQL query due to ORDER BY clause

I have a query joining 4 tables with a lot of conditions in the WHERE clause. The query also includes ORDER BY clause on a numeric column. It takes 6 seconds to return which is too long and I need to speed it up. Surprisingly I found that if I remove the ORDER BY clause it takes 2 seconds. Why the order by makes so massive difference and how to optimize it? I am using SQL server 2005. Many thanks.
I cannot confirm that the ORDER BY makes big difference since I am clearing the execution plan cache. However can you shed light at how to speed this up a little bit? The query is as follows (for simplicity there is "SELECT *" but I am only selecting the ones I need).
SELECT *
FROM View_Product_Joined j
INNER JOIN [dbo].[OPR_PriceLookup] pl on pl.siteID = NodeSiteID and pl.skuid = j.skuid
LEFT JOIN [dbo].[OPR_InventoryRules] irp on irp.ID = pl.SkuID and irp.InventoryRulesType = 'Product'
LEFT JOIN [dbo].[OPR_InventoryRules] irs on irs.ID = pl.siteID and irs.InventoryRulesType = 'Store'
WHERE (((((SiteName = N'EcommerceSite') AND (Published = 1)) AND (DocumentCulture = N'en-GB')) AND (NodeAliasPath LIKE N'/Products/Cats/Computers/Computer-servers/%')) AND ((NodeSKUID IS NOT NULL) AND (SKUEnabled = 1) AND pl.PriceLookupID in (select TOP 1 PriceLookupID from OPR_PriceLookup pl2 where pl.skuid = pl2.skuid and (pl2.RoleID = -1 or pl2.RoleId = 13) order by pl2.RoleID desc)))
ORDER BY NodeOrder ASC
Why the order by makes so massive difference and how to optimize it?
The ORDER BY needs to sort the resultset which may take long if it's big.
To optimize it, you may need to index the tables properly.
The index access path, however, has its drawbacks so it can even take longer.
If you have something other than equijoins in your query, or the ranged predicates (like <, > or BETWEEN, or GROUP BY clause), then the index used for ORDER BY may prevent the other indexes from being used.
If you post the query, I'll probably be able to tell you how to optimize it.
Update:
Rewrite the query:
SELECT *
FROM View_Product_Joined j
LEFT JOIN
[dbo].[OPR_InventoryRules] irp
ON irp.ID = j.skuid
AND irp.InventoryRulesType = 'Product'
LEFT JOIN
[dbo].[OPR_InventoryRules] irs
ON irs.ID = j.NodeSiteID
AND irs.InventoryRulesType = 'Store'
CROSS APPLY
(
SELECT TOP 1 *
FROM OPR_PriceLookup pl
WHERE pl.siteID = j.NodeSiteID
AND pl.skuid = j.skuid
AND pl.RoleID IN (-1, 13)
ORDER BY
pl.RoleID desc
) pl
WHERE SiteName = N'EcommerceSite'
AND Published = 1
AND DocumentCulture = N'en-GB'
AND NodeAliasPath LIKE N'/Products/Cats/Computers/Computer-servers/%'
AND NodeSKUID IS NOT NULL
AND SKUEnabled = 1
ORDER BY
NodeOrder ASC
The relation View_Product_Joined, as the name suggests, is probably a view.
Could you please post its definition?
If it is indexable, you may benefit from creating an index on View_Product_Joined (SiteName, Published, DocumentCulture, SKUEnabled, NodeOrder).