MySQL query optimization and EXPLAIN for a noob - sql

I've been working with databases for a long time but I'm new to query optimization. I have the following query (some of it code-generated):
SELECT DISTINCT COALESCE(gi.start_time, '') start_time,
COALESCE(b.name, '') bank,
COALESCE(a.id, '') account_id,
COALESCE(a.account_number, '') account_number,
COALESCE(at.code, '') account_type,
COALESCE(a.open_date, '') open_date,
COALESCE(a.interest_rate, '') interest_rate,
COALESCE(a.maturity_date, '') maturity_date,
COALESCE(a.opening_balance, '') opening_balance,
COALESCE(a.has_e_statement, '') has_e_statement,
COALESCE(a.has_bill_pay, '') has_bill_pay,
COALESCE(a.has_overdraft_protection, '') has_overdraft_protection,
COALESCE(a.balance, '') balance,
COALESCE(a.business_or_personal, '') business_or_personal,
COALESCE(a.cumulative_balance, '') cumulative_balance,
COALESCE(c.customer_number, '') customer_number,
COALESCE(c.social_security_number, '') social_security_number,
COALESCE(c.name, '') customer_name,
COALESCE(c.phone, '') phone,
COALESCE(c.deceased, '') deceased,
COALESCE(c.do_not_mail, '') do_not_mail,
COALESCE(cdob.date_of_birth, '') date_of_birth,
COALESCE(ad.line1, '') line1,
COALESCE(ad.line2, '') line2,
COALESCE(ad.city, '') city,
COALESCE(s.name, '') state,
COALESCE(ad.zip, '') zip,
COALESCE(o.officer_number, '') officer_number,
COALESCE(o.name, '') officer_name,
COALESCE(po.line1, '') po_box,
COALESCE(po.city, '') po_city,
COALESCE(po_state.name, '') po_state,
COALESCE(po.zip, '') zip,
COALESCE(br.number, '') branch_number,
COALESCE(cd_type.code, '') cd_type,
COALESCE(mp.product_number, '') macatawa_product_number,
COALESCE(mp.product_name, '') macatawa_product_name,
COALESCE(pt.name, '') macatawa_product_type,
COALESCE(hhsc.name, '') harte_hanks_service_category,
COALESCE(mp.hoh_hierarchy, '') hoh_hierarchy,
COALESCE(cft.name, '') core_file_type,
COALESCE(oa.line1, '') original_address_line1,
COALESCE(oa.line2, '') original_address_line2,
COALESCE(uc.code, '') use_class
FROM account a
JOIN customer c ON a.customer_id = c.id
JOIN officer o ON a.officer_id = o.id
JOIN account_address aa ON aa.account_id = a.id
LEFT JOIN account_po_box apb ON apb.account_id = a.id
JOIN address ad ON aa.address_id = ad.id
JOIN original_address oa ON oa.address_id = ad.id
LEFT JOIN address po ON apb.address_id = po.id
JOIN state s ON s.id = ad.state_id
LEFT JOIN state po_state ON po_state.id = po.state_id
LEFT JOIN branch br ON a.branch_id = br.id
JOIN account_import ai ON a.account_import_id = ai.id
JOIN generic_import gi ON gi.id = ai.generic_import_id
JOIN import_bundle ib ON gi.import_bundle_id = ib.id
JOIN bank b ON b.id = ib.bank_id
LEFT JOIN customer_date_of_birth cdob ON cdob.customer_id = c.id
LEFT JOIN cd_type ON a.cd_type_id = cd_type.id
LEFT JOIN account_macatawa_product amp ON amp.account_id = a.id
LEFT JOIN macatawa_product mp ON mp.id = amp.macatawa_product_id
LEFT JOIN product_type pt ON pt.id = mp.product_type_id
LEFT JOIN harte_hanks_service_category hhsc
ON hhsc.id = mp.harte_hanks_service_category_id
LEFT JOIN core_file_type cft ON cft.id = mp.core_file_type_id
LEFT JOIN use_class uc ON a.use_class_id = uc.id
LEFT JOIN account_type at ON a.account_type_id = at.id
WHERE 1
AND gi.active = 1
AND b.id = 8 AND ib.is_finished = 1
ORDER BY a.id
LIMIT 10
And it's pretty slow. On my dev server it takes about a minute to run and on my production server, where there's more data, I can't get it to even finish. Here's what an EXPLAIN looks like:
http://i.stack.imgur.com/eR6lq.png
I know the basics of EXPLAIN. I know that it's good that I have something other than NULL for everything under key. But I don't know, overall, how much room for improvement my query has. I do know that Using temporary; Using filesort under Extra is bad, but I have no idea what to do about it.

It looks like most of your JOIN columns are not indexed. Make sure every column that you use as a JOIN key is indexed in both tables.
With 23 joins and what looks like only 2 relevant indexes, poor performance is to be expected.
With no index to reference, the query engine has to check every row in both tables to compare them, which is very inefficient.
edit:
For example, in your query you have
JOIN customer c ON a.customer_id = c.id
Make sure you have an index on a.customer_id AND customer.id. Having an index on the joined columns of both tables will dramatically speed up the query.
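To see the effect of an index on a join key, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is runnable anywhere; the planner output differs from MySQL's EXPLAIN), and the table contents and the index name idx_account_customer_id are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE account  (id INTEGER PRIMARY KEY, customer_id INTEGER);
""")
conn.executemany("INSERT INTO customer VALUES (?, ?)",
                 [(i, f"customer {i}") for i in range(1, 101)])
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [(i, (i % 100) + 1) for i in range(1, 1001)])


def plan(sql):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# This lookup is what a join on account.customer_id does for each customer row.
lookup = "SELECT id FROM account WHERE customer_id = 42"

plan_before = plan(lookup)  # no index yet: the whole table is scanned
# Hypothetical index name, chosen for this example only.
conn.execute("CREATE INDEX idx_account_customer_id ON account (customer_id)")
plan_after = plan(lookup)   # the planner can now search the index

print("before:", plan_before)
print("after: ", plan_after)
```

In MySQL the equivalent check would be to run EXPLAIN before and after adding the index and compare the key and rows columns.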

In addition to what @JNK mentioned in his answer about ensuring you have indexes, I have restructured your query and added the STRAIGHT_JOIN keyword at the top, which tells the optimizer to join the tables in the order they are presented to it.
Since your query is filtered through generic import, then import bundle, then bank, I've moved THOSE to the front of the list. The WHERE clause will pre-qualify THOSE records first instead of looking at all accounts, many of which may never be part of the result. So the join is now reversed: it runs from generic import back to account, following the same relationships you started with.
I've also placed each JOIN's ON condition directly under the table it joins against, for readability and to make the table relationships easier to follow. I've written each ON clause as Table1.ID = JoinedTable.ID. Some of yours were reversed, which is no big deal, but seeing which table joins INTO which makes the query easier to read.
So, ensure the respective tables have indexes on whatever key column is used in the join and, from this sample query, make sure your generic_import table (alias gi) has an index on active, and your import_bundle table (alias ib) has an index on is_finished.
Lastly, your WHERE clause had WHERE 1 AND ...; the "1" serves no purpose, so I stripped it out.
SELECT STRAIGHT_JOIN DISTINCT
COALESCE(gi.start_time, '') start_time,
COALESCE(b.name, '') bank,
COALESCE(a.id, '') account_id,
COALESCE(a.account_number, '') account_number,
COALESCE(at.code, '') account_type,
COALESCE(a.open_date, '') open_date,
COALESCE(a.interest_rate, '') interest_rate,
COALESCE(a.maturity_date, '') maturity_date,
COALESCE(a.opening_balance, '') opening_balance,
COALESCE(a.has_e_statement, '') has_e_statement,
COALESCE(a.has_bill_pay, '') has_bill_pay,
COALESCE(a.has_overdraft_protection, '') has_overdraft_protection,
COALESCE(a.balance, '') balance,
COALESCE(a.business_or_personal, '') business_or_personal,
COALESCE(a.cumulative_balance, '') cumulative_balance,
COALESCE(c.customer_number, '') customer_number,
COALESCE(c.social_security_number, '') social_security_number,
COALESCE(c.name, '') customer_name,
COALESCE(c.phone, '') phone,
COALESCE(c.deceased, '') deceased,
COALESCE(c.do_not_mail, '') do_not_mail,
COALESCE(cdob.date_of_birth, '') date_of_birth,
COALESCE(ad.line1, '') line1,
COALESCE(ad.line2, '') line2,
COALESCE(ad.city, '') city,
COALESCE(s.name, '') state,
COALESCE(ad.zip, '') zip,
COALESCE(o.officer_number, '') officer_number,
COALESCE(o.name, '') officer_name,
COALESCE(po.line1, '') po_box,
COALESCE(po.city, '') po_city,
COALESCE(po_state.name, '') po_state,
COALESCE(po.zip, '') zip,
COALESCE(br.number, '') branch_number,
COALESCE(cd_type.code, '') cd_type,
COALESCE(mp.product_number, '') macatawa_product_number,
COALESCE(mp.product_name, '') macatawa_product_name,
COALESCE(pt.name, '') macatawa_product_type,
COALESCE(hhsc.name, '') harte_hanks_service_category,
COALESCE(mp.hoh_hierarchy, '') hoh_hierarchy,
COALESCE(cft.name, '') core_file_type,
COALESCE(oa.line1, '') original_address_line1,
COALESCE(oa.line2, '') original_address_line2,
COALESCE(uc.code, '') use_class
FROM
generic_import gi
JOIN import_bundle ib
ON gi.import_bundle_id = ib.id
JOIN bank b
ON ib.bank_id = b.id
JOIN account_import ai
ON gi.id = ai.generic_import_id
JOIN account a
ON ai.id = a.account_import_id
JOIN customer c
ON a.customer_id = c.id
LEFT JOIN customer_date_of_birth cdob
ON c.id = cdob.customer_id
JOIN officer o
ON a.officer_id = o.id
LEFT JOIN branch br
ON a.branch_id = br.id
LEFT JOIN cd_type
ON a.cd_type_id = cd_type.id
LEFT JOIN account_macatawa_product amp
ON a.id = amp.account_id
LEFT JOIN macatawa_product mp
ON amp.macatawa_product_id = mp.id
LEFT JOIN product_type pt
ON mp.product_type_id = pt.id
LEFT JOIN harte_hanks_service_category hhsc
ON mp.harte_hanks_service_category_id = hhsc.id
LEFT JOIN core_file_type cft
ON mp.core_file_type_id = cft.id
LEFT JOIN use_class uc
ON a.use_class_id = uc.id
LEFT JOIN account_type at
ON a.account_type_id = at.id
JOIN account_address aa
ON a.id = aa.account_id
JOIN address ad
ON aa.address_id = ad.id
JOIN original_address oa
ON ad.id = oa.address_id
JOIN state s
ON ad.state_id = s.id
LEFT JOIN account_po_box apb
ON a.id = apb.account_id
LEFT JOIN address po
ON apb.address_id = po.id
LEFT JOIN state po_state
ON po.state_id = po_state.id
WHERE
gi.active = 1
AND ib.is_finished = 1
AND b.id = 8
ORDER BY
a.id
LIMIT
10

Related

Duplicated data on sql request

I'm having some issues with my SQL query. The parts all work one by one, but when I join them into one global query, all of my joined data is duplicated.
Here is my current global query. It works, but not the way I would like.
I have already tried a few things but can't find the answer to my problem.
Thanks for your help.
SELECT films.titre,films.annee,films.description,films.image_film,
GROUP_CONCAT(genre.type SEPARATOR ', ') AS genre,
GROUP_CONCAT(realisateur.realisateur SEPARATOR ', ') AS realisateur,
GROUP_CONCAT(acteur.acteur SEPARATOR ', ') AS acteur
FROM film_genre
INNER JOIN films ON film_genre.film = films.id
INNER JOIN film_realisateur ON film_realisateur.film = films.id
INNER JOIN realisateur ON realisateur.id = film_realisateur.realisateur
INNER JOIN genre ON genre.id = film_genre.genre
INNER JOIN film_acteur ON film_acteur.film = films.id
INNER JOIN acteur ON acteur.id = film_acteur.acteur
GROUP BY films.titre
Here is the corrected query, thanks to Aman B :)
SELECT films.titre,films.annee,films.description,films.image_film,
GROUP_CONCAT(DISTINCT genre.type SEPARATOR ', ') AS genre,
GROUP_CONCAT(DISTINCT realisateur.realisateur SEPARATOR ', ') AS realisateur,
GROUP_CONCAT(DISTINCT acteur.acteur SEPARATOR ', ') AS acteur
FROM film_genre
INNER JOIN films ON film_genre.film = films.id
INNER JOIN film_realisateur ON film_realisateur.film = films.id
INNER JOIN realisateur ON realisateur.id = film_realisateur.realisateur
INNER JOIN genre ON genre.id = film_genre.genre
INNER JOIN film_acteur ON film_acteur.film = films.id
INNER JOIN acteur ON acteur.id = film_acteur.acteur
GROUP BY films.titre
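Why DISTINCT fixes it: joining a film to both its genres and its actors produces one row per genre/actor combination, so each value is repeated. A minimal sketch of that effect follows, using Python's sqlite3 module as a stand-in for MySQL and a simplified, hypothetical schema (names stored directly in the link tables). SQLite's GROUP_CONCAT only accepts DISTINCT with the default "," separator, so the custom SEPARATOR is left out here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical schema: one film with two genres and two actors.
conn.executescript("""
    CREATE TABLE films       (id INTEGER PRIMARY KEY, titre TEXT);
    CREATE TABLE film_genre  (film INTEGER, genre TEXT);
    CREATE TABLE film_acteur (film INTEGER, acteur TEXT);
    INSERT INTO films VALUES (1, 'Film A');
    INSERT INTO film_genre  VALUES (1, 'Drama'), (1, 'Comedy');
    INSERT INTO film_acteur VALUES (1, 'Alice'), (1, 'Bob');
""")

query = """
    SELECT GROUP_CONCAT({d} g.genre), GROUP_CONCAT({d} a.acteur)
    FROM films f
    JOIN film_genre  g ON g.film = f.id
    JOIN film_acteur a ON a.film = f.id
    GROUP BY f.id
"""

# 2 genres x 2 actors = 4 joined rows, so every value shows up twice.
plain = conn.execute(query.format(d="")).fetchone()
# DISTINCT collapses the repeats inside each aggregate.
dedup = conn.execute(query.format(d="DISTINCT")).fetchone()

print("without DISTINCT:", plain)
print("with DISTINCT:   ", dedup)
```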

Remove duplicate address_id from sql data set

I need to get only distinct address_id values in the result, with no duplicates. Here is my query.
SELECT DISTINCT address.address_id, address.address1, address.streetcity, state.stateabbrev, rtrim(ltrim(case when address.streetzipcode is not null and address.streetzipcode != 'NULL' then address.streetzipcode else '' end))+case when len(address.streetzipplus4)>0 then '-'+rtrim(ltrim(address.streetzipplus4)) else '' end as streetzipcode, address.homephone,
dbo.f_addressstudent (student.address_id) as Students,
dbo.f_addresspeople (student.address_id) as Adults,
case
when @classif_id IS NULL then 0
else
student.classif_id
end classif,
classifctn
FROM district WITH(NOLOCK)
JOIN dbo.building ON building.district_id = district.district_id
JOIN dbo.studbldg_bridge WITH(NOLOCK) ON studbldg_bridge.bldg_id=building.bldg_id
JOIN dbo.student WITH(NOLOCK) ON student.student_id = studbldg_bridge.student_id
JOIN classif with(nolock) on student.classif_id = classif.classif_id
LEFT JOIN dbo.address WITH(NOLOCK) ON student.address_id = address.address_id
LEFT JOIN dbo.state WITH(NOLOCK) ON address.streetstate_id = state.state_id
LEFT JOIN dbo.state AS mailstate WITH(NOLOCK) ON address.state_id = mailstate.state_id
WHERE district.district_id = (SELECT district_id FROM dbo.building WITH(NOLOCK) WHERE bldg_id = @bldg_id)
ORDER BY classif,Adults, Students
Here is result of query
Query result with error in data
I have tried to GROUP BY and use an aggregate function with address_id, but I also have non-aggregate columns, so it didn't work for me.
After that I also tried using OVER(partition by address.address_id), but that didn't work either.
Any help will be appreciated.
Thank you
**UPDATE on Business logic/Requirements **
I need to get unique addresses for the parents of students. As a parent can have two or more children living at the same address, this causes duplication. In other words, I need to get only one child per parent.
From your image of the results, it looks like the classifctn column has more than one value per address, so your row is repeated once for each value. To get one distinct address_id along with the rest of the columns, either remove that column from your query or set a precedence that returns only one record per address_id.
Further, please tag only the RDBMS you are actually using. MySQL, for example, had no window functions before version 8.0, yet you tagged it and referenced using OVER(partition...., which would not have been possible there.
;WITH cte AS (
SELECT DISTINCT address.address_id, address.address1, address.streetcity, state.stateabbrev, rtrim(ltrim(case when address.streetzipcode is not null and address.streetzipcode != 'NULL' then address.streetzipcode else '' end))+case when len(address.streetzipplus4)>0 then '-'+rtrim(ltrim(address.streetzipplus4)) else '' end as streetzipcode, address.homephone,
dbo.f_addressstudent (student.address_id) as Students,
dbo.f_addresspeople (student.address_id) as Adults,
case
when @classif_id IS NULL then 0
else
student.classif_id
end classif,
classifctn,
ROW_NUMBER() OVER (PARTITION BY address.address_id ORDER BY HOW WILL YOU CHOOSE?) AS RowNum
FROM district WITH(NOLOCK)
JOIN dbo.building ON building.district_id = district.district_id
JOIN dbo.studbldg_bridge WITH(NOLOCK) ON studbldg_bridge.bldg_id=building.bldg_id
JOIN dbo.student WITH(NOLOCK) ON student.student_id = studbldg_bridge.student_id
JOIN classif with(nolock) on student.classif_id = classif.classif_id
LEFT JOIN dbo.address WITH(NOLOCK) ON student.address_id = address.address_id
LEFT JOIN dbo.state WITH(NOLOCK) ON address.streetstate_id = state.state_id
LEFT JOIN dbo.state AS mailstate WITH(NOLOCK) ON address.state_id = mailstate.state_id
WHERE district.district_id = (SELECT district_id FROM dbo.building WITH(NOLOCK) WHERE bldg_id = @bldg_id)
)
SELECT *
FROM
cte
WHERE
RowNum = 1
ORDER BY
classif
,Adults
,Students
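A stripped-down sketch of the ROW_NUMBER() pattern above, run through Python's sqlite3 module (a stand-in for SQL Server; window functions need SQLite 3.25 or later). The student table is reduced to three hypothetical columns, and ordering by student_id is only one arbitrary way to answer the "how will you choose?" question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: students 1 and 2 share address 10.
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY,
                          address_id INTEGER,
                          classif_id INTEGER);
    INSERT INTO student VALUES (1, 10, 3), (2, 10, 5), (3, 20, 4);
""")

rows = conn.execute("""
    WITH cte AS (
        SELECT address_id, student_id, classif_id,
               -- Number the students within each address; the ORDER BY
               -- decides which one gets RowNum = 1 (assumed: lowest id).
               ROW_NUMBER() OVER (PARTITION BY address_id
                                  ORDER BY student_id) AS RowNum
        FROM student
    )
    SELECT address_id, student_id, classif_id
    FROM cte
    WHERE RowNum = 1
    ORDER BY address_id
""").fetchall()

print(rows)  # one row per address_id
```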
Alternatively, you could nest your SELECT query. Note, though, that this solution is of limited use, as it will only return one grade/classifctn when more than one exists in a household. If you really don't care about that column, you should just remove it from your query.
Actually, both your classifctn and classif columns will give you multiple rows when more than one student lives at the same address. Here is a way to concatenate those values into a single row. You should spend some more time on your business case and on defining it for us, but here is one example:
SELECT DISTINCT
address.address_id
,address.address1
,address.streetcity
,state.stateabbrev
,LTRIM(RTRIM(ISNULL(NULLIF(address.streetzipcode,'NULL'),'')))
+ CASE WHEN LEN(address.streetzipplus4) > 0 THEN '-' ELSE '' END
+ LTRIM(RTRIM(ISNULL(address.streetzipplus4,''))) AS streetzipcode
,address.homephone
,dbo.f_addressstudent (student.address_id) as Students
,dbo.f_addresspeople (student.address_id) as Adults
, case
when @classif_id IS NULL then 0
else student.classif_id
end classif
,STUFF(
(SELECT ',' + CAST(classif_id AS VARCHAR(100))
FROM
classif c
WHERE c.classif = student.classif
FOR XML PATH(''))
,1,1,'') AS classifs
,STUFF(
(SELECT ',' + CAST(classifctn AS VARCHAR(100))
FROM
classif c
WHERE c.classif = student.classif
FOR XML PATH(''))
,1,1,'') AS classifctns
FROM
district WITH(NOLOCK)
INNER JOIN dbo.building
ON building.district_id = district.district_id
AND building.bldg_id = @bldg_id
INNER JOIN dbo.student WITH(NOLOCK)
ON student.student_id = studbldg_bridge.student_id
INNER JOIN dbo.address WITH(NOLOCK)
ON student.address_id = address.address_id
LEFT JOIN dbo.state WITH(NOLOCK)
ON address.streetstate_id = state.state_id
Note that I went ahead and changed the zip code logic to show you some uses of ISNULL() and NULLIF(), which are helpful in cases like this. I also removed 3 tables, because 2 are not used and the third ends up being used in a subselect to concatenate the values. The address table was also changed to an INNER JOIN, because if an address doesn't exist all of the other information becomes blank/useless. These are the joins I removed:
INNER JOIN dbo.studbldg_bridge WITH(NOLOCK) ON studbldg_bridge.bldg_id=building.bldg_id
LEFT JOIN dbo.state AS mailstate WITH(NOLOCK) ON address.state_id = mailstate.state_id
INNER JOIN classif with(nolock) on student.classif_id = classif.classif_id

how to select and print from each variable in another select sql query

I use this query to get a list of company names, one name per row:
select distinct companyName from Companies
For each name, I can use this query to get another property related to that company name:
SELECT distinct
STUFF((SELECT ', '+ cn.name
from WMCCMCategories cn
INNER JOIN CategorySets uc
ON uc.categoryId = cn.categoryID
INNER JOIN KeyProcesses u
ON u.categorySetId = uc.setId
INNER JOIN Companies c
ON c.companyId = u.companyId
WHERE c.companyName = @companyName
ORDER BY cn.name FOR XML PATH('')), 1, 1, '') AS listStr
FROM WMCCMCategories cnn
Group by cnn.name
Now I want to apply that query to each name from the first query, so I replaced @companyName with that first query:
SELECT distinct
STUFF((SELECT ', '+ cn.name
from WMCCMCategories cn
INNER JOIN CategorySets uc ON uc.categoryId = cn.categoryID
INNER JOIN KeyProcesses u ON u.categorySetId = uc.setId
INNER JOIN Companies c ON c.companyId = u.companyId
WHERE c.companyName in
(select distinct companyName from Companies)
ORDER BY cn.name FOR XML PATH('')), 1, 1, '') AS listStr
FROM
WMCCMCategories cnn
GROUP BY
cnn.name
But it prints all the results in one row. What I need is one row per company name, with the properties for that company on it. How can I modify the query to get that?
Write as:
SELECT distinct
c1.companyName,
STUFF((SELECT ', '+ cn.name
from WMCCMCategories cn
INNER JOIN CategorySets uc
ON uc.categoryId = cn.categoryID
INNER JOIN KeyProcesses u
ON u.categorySetId = uc.setId
INNER JOIN Companies c
ON c.companyId = u.companyId
WHERE c.companyName = c1.companyName
ORDER BY cn.name FOR XML PATH('')), 1, 1, '') AS listStr
FROM Companies c1
Group by c1.companyName
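The key change is that the inner query is correlated to the outer row through c1.companyName, so it is evaluated once per company. A minimal sketch of the same idea, using Python's sqlite3 module as a stand-in for SQL Server: GROUP_CONCAT replaces the STUFF(... FOR XML PATH('')) idiom, and the single Categories table is a hypothetical simplification of the three joined tables in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified schema: categories linked straight to companies.
conn.executescript("""
    CREATE TABLE Companies  (companyId INTEGER PRIMARY KEY, companyName TEXT);
    CREATE TABLE Categories (companyId INTEGER, name TEXT);
    INSERT INTO Companies  VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO Categories VALUES (1, 'Sales'), (1, 'HR'), (2, 'IT');
""")

rows = conn.execute("""
    SELECT DISTINCT
           c1.companyName,
           -- Correlated subquery: re-evaluated for each outer company row.
           (SELECT GROUP_CONCAT(cn.name, ', ')
            FROM Categories cn
            INNER JOIN Companies c ON c.companyId = cn.companyId
            WHERE c.companyName = c1.companyName) AS listStr
    FROM Companies c1
    ORDER BY c1.companyName
""").fetchall()

for company, categories in rows:
    print(company, "->", categories)
```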

SQL Error 1016 - Inner Joins

I'm trying to add inner joins to old SQL code to make it run more efficiently. But when I added them and tried to execute I get this error:
1016, Line 12 Outer join operators cannot be specified in a query containing joined tables
Here's the query:
select a.s_purchase_order as order_id,
a.order_type,
a.nobackorder,
a.order_note,
a.note,
a.rqst_dlvry_date,
b.customer_name ,
c.store_name,
(c.store_name + ',' + isnull(c.address1 + ',', ' ') + isnull(c.city + ',', ' ') + isnull(c.state_cd+ ',', ' ') + isnull( c.zipcode, ' ')) as store_info,
d.supplier_account
from VW_CustomerOrder a, Customer b, Store c, eligible_supplier d
where a.customer = c.customer
and a.store = c.store
and a.customer = b.customer
and c.customer *= d.customer
and c.store *= d.store
and a.supplier *= d.supplier
and a.purchase_order = @order_id
and a.customer = @customer_id
and a.store=@store_id
and a.supplier = @supplier_id
Any idea what's causing it? I'm guessing it has something to do with the isnull?
Did you try this? It replaces your commas between your tables with INNER JOIN and LEFT JOIN
select a.s_purchase_order as order_id,
a.order_type,
a.nobackorder,
a.order_note,
a.note,
a.rqst_dlvry_date,
b.customer_name ,
c.store_name,
(c.store_name + ',' + isnull(c.address1 + ',', ' ') + isnull(c.city + ',', ' ') + isnull(c.state_cd+ ',', ' ') + isnull( c.zipcode, ' ')) as store_info,
d.supplier_account
from VW_CustomerOrder a
INNER JOIN Customer b
ON a.customer = b.customer
INNER JOIN Store c
ON a.customer = c.customer
and a.store = c.store
LEFT JOIN eligible_supplier d
ON c.customer = d.customer
and c.store = d.store
and a.supplier = d.supplier
where a.purchase_order = @order_id
and a.customer = @customer_id
and a.store=@store_id
and a.supplier = @supplier_id
If you left your "*=" join operators in the code after you converted it to ANSI syntax, that would explain your error. Use = for all equality tests when using ANSI syntax -- the type of your JOIN should be explicit in the JOIN declaration itself (INNER, LEFT, RIGHT, etc.)
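For reference, this is roughly what the explicit LEFT JOIN buys you compared with an INNER JOIN: rows from the left table survive even when nothing matches on the right. The sketch below uses Python's sqlite3 module as a stand-in for SQL Server, with two cut-down tables and made-up data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: store 2 has no eligible supplier.
conn.executescript("""
    CREATE TABLE Store (customer INTEGER, store INTEGER, store_name TEXT);
    CREATE TABLE eligible_supplier (customer INTEGER, store INTEGER,
                                    supplier_account TEXT);
    INSERT INTO Store VALUES (1, 1, 'North'), (1, 2, 'South');
    INSERT INTO eligible_supplier VALUES (1, 1, 'ACC-100');
""")

join_sql = """
    SELECT c.store_name, d.supplier_account
    FROM Store c
    {kind} JOIN eligible_supplier d
        ON c.customer = d.customer
       AND c.store = d.store
    ORDER BY c.store_name
"""

# INNER JOIN drops the store that has no matching supplier row.
inner_rows = conn.execute(join_sql.format(kind="INNER")).fetchall()
# LEFT JOIN keeps it and fills the supplier column with NULL (None).
left_rows = conn.execute(join_sql.format(kind="LEFT")).fetchall()

print("INNER:", inner_rows)
print("LEFT: ", left_rows)
```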

Why does removing the ORDER BY significantly speed up this query?

I have the following query (some of it is code-generated so pardon the poor formatting):
SELECT DISTINCT COALESCE(gi.start_time, '') start_time,
COALESCE(b.name, '') bank,
COALESCE(a.id, '') account_id,
COALESCE(a.account_number, '') account_number,
COALESCE(at.code, '') account_type,
COALESCE(a.open_date, '') open_date,
COALESCE(a.interest_rate, '') interest_rate,
COALESCE(a.maturity_date, '') maturity_date,
COALESCE(a.opening_balance, '') opening_balance,
COALESCE(a.has_e_statement, '') has_e_statement,
COALESCE(a.has_bill_pay, '') has_bill_pay,
COALESCE(a.has_overdraft_protection, '') has_overdraft_protection,
COALESCE(a.balance, '') balance,
COALESCE(a.business_or_personal, '') business_or_personal,
COALESCE(a.cumulative_balance, '') cumulative_balance,
COALESCE(c.customer_number, '') customer_number,
COALESCE(c.social_security_number, '') social_security_number,
COALESCE(c.name, '') customer_name,
COALESCE(c.phone, '') phone,
COALESCE(c.deceased, '') deceased,
COALESCE(c.do_not_mail, '') do_not_mail,
COALESCE(cdob.date_of_birth, '') date_of_birth,
COALESCE(ad.line1, '') line1,
COALESCE(ad.line2, '') line2,
COALESCE(ad.city, '') city,
COALESCE(s.name, '') state,
COALESCE(ad.zip, '') zip,
COALESCE(o.officer_number, '') officer_number,
COALESCE(o.name, '') officer_name,
COALESCE(po.line1, '') po_box,
COALESCE(po.city, '') po_city,
COALESCE(po_state.name, '') po_state,
COALESCE(po.zip, '') zip,
COALESCE(br.number, '') branch_number,
COALESCE(cd_type.code, '') cd_type,
COALESCE(mp.product_number, '') macatawa_product_number,
COALESCE(mp.product_name, '') macatawa_product_name,
COALESCE(pt.name, '') macatawa_product_type,
COALESCE(hhsc.name, '') harte_hanks_service_category,
COALESCE(mp.hoh_hierarchy, '') hoh_hierarchy,
COALESCE(cft.name, '') core_file_type,
COALESCE(oa.line1, '') original_address_line1,
COALESCE(oa.line2, '') original_address_line2,
COALESCE(uc.code, '') use_class
FROM account a
JOIN customer c ON a.customer_id = c.id
JOIN officer o ON a.officer_id = o.id
JOIN account_address aa ON aa.account_id = a.id
LEFT JOIN account_po_box apb ON apb.account_id = a.id
JOIN address ad ON aa.address_id = ad.id
JOIN original_address oa ON oa.address_id = ad.id
LEFT JOIN address po ON apb.address_id = po.id
JOIN state s ON s.id = ad.state_id
LEFT JOIN state po_state ON po_state.id = po.state_id
LEFT JOIN branch br ON a.branch_id = br.id
JOIN account_import ai ON a.account_import_id = ai.id
JOIN generic_import gi ON gi.id = ai.generic_import_id
JOIN import_bundle ib ON gi.import_bundle_id = ib.id
JOIN bank b ON b.id = ib.bank_id
LEFT JOIN customer_date_of_birth cdob ON cdob.customer_id = c.id
LEFT JOIN cd_type ON a.cd_type_id = cd_type.id
LEFT JOIN account_macatawa_product amp ON amp.account_id = a.id
LEFT JOIN macatawa_product mp ON mp.id = amp.macatawa_product_id
LEFT JOIN product_type pt ON pt.id = mp.product_type_id
LEFT JOIN harte_hanks_service_category hhsc ON hhsc.id = mp.harte_hanks_service_category_id
LEFT JOIN core_file_type cft ON cft.id = mp.core_file_type_id
LEFT JOIN use_class uc ON a.use_class_id = uc.id
LEFT JOIN account_type at ON a.account_type_id = at.id
WHERE 1
AND gi.active = 1
AND b.id = 8 AND ib.is_finished = 1
ORDER BY a.id
LIMIT 10
I have indexes on all the appropriate columns, including account.id AKA a.id. Despite this fact, my query significantly speeds up (it goes from 10 seconds to 0 seconds) if I remove the ORDER BY. Why is this?
Because with the ORDER BY, it has to retrieve all the rows to sort them to get the first 10 by a.id. Without the ORDER BY, it can simply retrieve the first 10 rows it finds and ignore the rest.
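You can watch the extra sort step appear in a query plan. The sketch below uses Python's sqlite3 module as a stand-in for MySQL, so the wording differs (SQLite reports a temp B-tree where MySQL reports "Using temporary; Using filesort"). The table and data are made up, and the sort column is deliberately left unindexed to force the sort.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account (balance) VALUES (?)",
                 [((i * 37) % 1000,) for i in range(5000)])


def plan(sql):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# No ORDER BY: the engine can stop after the first 10 rows it reads.
unordered = plan("SELECT balance FROM account LIMIT 10")
# ORDER BY on an unindexed column: every row must be read and sorted
# before the first 10 can be returned.
ordered = plan("SELECT balance FROM account ORDER BY balance LIMIT 10")

print("no ORDER BY:  ", unordered)
print("with ORDER BY:", ordered)
```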
Also, be careful when profiling queries: the first can fill the cache with data, and subsequent queries go faster not because the SQL is different, but because it's pulling data from the cache instead of the disk.
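One way to guard against that when comparing two versions of a query is to run each one several times and look at the first (cold) run separately from the later (warm) ones. A minimal sketch follows; time_query is a hypothetical helper, and the in-memory SQLite database is only a stand-in, so it will not show a real disk-versus-cache gap.

```python
import sqlite3
import time


def time_query(conn, sql, repeats=3):
    """Run `sql` `repeats` times and return the elapsed seconds of each run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        conn.execute(sql).fetchall()  # fetch everything so the work is done
        timings.append(time.perf_counter() - start)
    return timings


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val INTEGER)")
conn.executemany("INSERT INTO t (val) VALUES (?)",
                 [(i,) for i in range(10000)])

timings = time_query(conn, "SELECT val FROM t ORDER BY val LIMIT 10")
cold, warm = timings[0], timings[1:]
print(f"first run: {cold:.6f}s, later runs: {[f'{t:.6f}' for t in warm]}")
```

Compare the later runs of one query against the later runs of the other, rather than a cold run against a warm one.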