Why my MariaDB query takes so long? How can I fix it? - sql

SELECT
ps_customer.id_customer
CONCAT(ps_customer.firstname," ",ps_customer.lastname) AS "Fullname"
ps_customer.email
ps_customer.date_add
ps_orders.id_order
ps_orders.id_address_delivery
ps_orders.payment
ps_stripepro_transaction.status
ps_order_detail.id_shop
ps_order_detail.product_name
ps_address.address1
ps_address.address2
ps_address.postcode
ps_address.city
ps_address.phone
ps_address.phone_mobile
ps_vd_customer_dogs.name
ps_vd_customer_dogs.size
ps_vd_customer_dogs.breed
ps_vd_customer_dogs.toy_destroyer
ps_order_cart_rule.name
ps_order_cart_rule.value
ps_order_detail.product_quantity
ps_message.message
FROM ps_orders
LEFT JOIN ps_customer
ON ps_orders.id_customer = ps_customer.id_customer
LEFT JOIN ps_stripepro_transaction
ON ps_orders.id_order = ps_stripepro_transaction.id_order
LEFT JOIN ps_order_detail
ON ps_orders.id_order = ps_order_detail.id_order
LEFT JOIN ps_address
ON ps_orders.id_customer = ps_address.id_customer
LEFT JOIN ps_vd_customer_dogs
ON ps_orders.id_customer = ps_vd_customer_dogs.id_customer
LEFT JOIN ps_order_cart_rule
ON ps_orders.id_order = ps_order_cart_rule.id_order
LEFT JOIN ps_message
ON ps_orders.id_customer = ps_message.id_customer
ORDER BY ps_orders.id_order DESC
This is my query. If I shorten it and reduce it to the first 4 left joins, I get the results in 10 min, but when I add the other joins I have to wait over 3 hours.
How can I improve this? My objective is to create a CSV with all the data I'm selecting so that we can improve many people's job in my office who have to check many parameters for each client in our PrestaShop backoffice.

Related

Improve query in stored procedure

I need to improve this query inside a stored procedure, the performance with lots of data broke the application. Is there any way to make it faster?
I need to collect certain columns from several tables to build a dashboard in my app web, others columns I collected in other query and joined in my controller through classes.
EDIT: this query is a part from dynamically transaction, they can change the database.
SELECT
a.fact_num as cotizacion, a.comentario, m.co_cli, k.cli_des,
m.co_ven, l.ven_des, m.fec_emis, m.fec_venc, m.campo8,
a.reng_num, a.co_art, g.art_des, a.co_alma, b.fact_num as pedido,
c.fact_num as factura, d.fact_num as despacho, e.cob_num as cobro,
f.fec_venc as fecha_venc, f.fec_emis as fecha_pedido,
h.odp_num as ord_produccion, h.co_ced as cedula, i.req_num as requisicion,
j.ent_num as cierre
FROM
reng_cac a
LEFT JOIN
cotiz_c m ON a.fact_num = m.fact_num
LEFT JOIN
reng_ped b ON a.co_art = b.co_art AND a.fact_num = b.num_doc AND b.tipo_doc = 'T'
LEFT JOIN
pedidos f ON b.fact_num = f.fact_num
LEFT JOIN
reng_fac c ON b.fact_num = c.num_doc AND a.co_art = c.co_art AND c.tipo_doc = 'P'
LEFT JOIN
reng_ndd d ON c.fact_num = d.num_doc AND a.co_art = d.co_art AND d.tipo_doc = 'F'
LEFT JOIN
reng_cob e ON c.fact_num = e.doc_num AND e.tp_doc_cob = 'FACT'
LEFT JOIN
art g ON a.co_art = g.co_art
LEFT JOIN
spodp h ON b.fact_num = h.doc_ori AND b.co_art = h.co_art
LEFT JOIN
spreqalm i ON h.odp_num = i.odp_num
LEFT JOIN
spcierre j ON h.odp_num = j.odp_num
LEFT JOIN
clientes k ON m.co_cli = k.co_cli
LEFT JOIN
vendedor l ON m.co_ven = l.co_ven
WHERE
a.fact_num BETWEEN '0' AND '999999999'
AND m.fec_emis BETWEEN '01/01/2012' AND '30/06/2012'
AND m.co_cli BETWEEN '' AND 'þþþþþþþþþþþþþþþþþþþþþþþþþþþþþþ'
ORDER BY
a.fact_num, a.reng_num ASC
As Nick commented, you always need to run your query in a execution plan. There are some points you need to consider. For instance, having a lot of LEFT JOINs will reduce the performance. Try to see if you can use INNER JOINs where possible. Check if you have proper indexes. If you can attach your execution plan to your question, you will get more helpful answers.

Reduce query complexity

I have this multi join query, is there any way to reduce the number of joins? Or maybe to split the query to 2 parts but then still get the same result set?
Having too many joins makes the query execute very slowly and inefficently
SELECT
MD.EVENT_ID,
AUI.ATLAS_USER_ID,
EVENT_TIME,
EVENT_TYPE,
INTERACTION_TOKEN,
CC.COUNTRY_CODE,
AP.ADV_PROJECT_ID, AP.ADV_PROJECT_NAME,
AC.ADV_CAMPAIGN_ID,
ADV_CAMPAIGN_NAME,
PC.PRT_CAMPAIGN_ID, PC.PRT_CAMPAIGN_NAME,
IP.IP_ADDRESS,
OS.OS,
BR.BROWSER,
FU.FULL_USER_AGENT,
RAW_ACTION_ID,
SE.SELLER_NETWORK_ID,
RIURL.RAW_INPUT as URL,
RIREF.RAW_INPUT as REF_URL,
MVG1.MAP_VALUE as TABOOLA ,
MVG2.MAP_VALUE as APPNEXUS ,
MVG3.MAP_VALUE as ETAG ,
MVG4.MAP_VALUE as FACEBOOK ,
MVG5.MAP_VALUE as MEDIAMATH ,
MVG6.MAP_VALUE as COOKIEID ,
MVG7.MAP_VALUE as IDFA ,
MVG8.MAP_VALUE as ADVLOGIN ,
MVG9.MAP_VALUE as OPENX ,
MVG10.MAP_VALUE as ADTRUTH ,
MVG11.MAP_VALUE as GOOGLE ,
MVG12.MAP_VALUE as ANDROID_ADV_ID ,
MVG13.MAP_VALUE as SDGUPI ,
MVG15.MAP_VALUE as RMX
FROM
EVENT_124_2 BASE
INNER JOIN
atlas__atlas_events MD ON BASE.EVENT_ID = MD.EVENT_ID
LEFT JOIN
LOOKUP_STG__MAP_VALUE MVG1 ON MVG1.MAP_VALUE_GK = BASE.TABOOLA_HASH
LEFT JOIN
LOOKUP_STG__MAP_VALUE MVG2 ON MVG2.MAP_VALUE_GK = BASE.APPNEXUS_HASH
LEFT JOIN
LOOKUP_STG__MAP_VALUE MVG3 ON MVG3.MAP_VALUE_GK = BASE.ETAG_HASH
LEFT JOIN
LOOKUP_STG__MAP_VALUE MVG4 ON MVG4.MAP_VALUE_GK = BASE.FACEBOOK_HASH
LEFT JOIN
LOOKUP_STG__MAP_VALUE MVG5 ON MVG5.MAP_VALUE_GK = BASE.MEDIAMATH_HASH
LEFT JOIN
LOOKUP_STG__MAP_VALUE MVG6 ON MVG6.MAP_VALUE_GK = BASE.COOKIEID_HASH
LEFT JOIN
LOOKUP_STG__MAP_VALUE MVG7 ON MVG7.MAP_VALUE_GK = BASE.IDFA_HASH
LEFT JOIN
LOOKUP_STG__MAP_VALUE MVG8 ON MVG8.MAP_VALUE_GK = BASE.ADVLOGIN_HASH
LEFT JOIN
LOOKUP_STG__MAP_VALUE MVG9 ON MVG9.MAP_VALUE_GK = BASE.OPENX_HASH
LEFT JOIN
LOOKUP_STG__MAP_VALUE MVG10 ON MVG10.MAP_VALUE_GK = BASE.ADTRUTH_HASH
LEFT JOIN
LOOKUP_STG__MAP_VALUE MVG11 ON MVG11.MAP_VALUE_GK = BASE.GOOGLE_HASH
LEFT JOIN
LOOKUP_STG__MAP_VALUE MVG12 ON MVG12.MAP_VALUE_GK = BASE.ANDROID_ADV_ID_HASH
LEFT JOIN
LOOKUP_STG__MAP_VALUE MVG13 ON MVG13.MAP_VALUE_GK = BASE.SDGUPI_HASH
LEFT JOIN
LOOKUP_STG__MAP_VALUE MVG15 ON MVG15.MAP_VALUE_GK = BASE.RMX_HASH
LEFT JOIN
LOOKUP_STG__ATLAS_USER_ID AUI ON AUI.ATLAS_USER_ID_GK = MD.ATLAS_USER_ID
LEFT JOIN
GAYA__ADV_PROJECTS AP ON AP.ADV_PROJECT_ID = MD.ADV_PROJECT_ID
LEFT JOIN
GAYA__ADV_CAMPAIGNS AC ON AC.ADV_CAMPAIGN_ID = MD.ADV_CAMPAIGN_ID
LEFT JOIN
GAYA__PRT_CAMPAIGNS PC ON PC.PRT_CAMPAIGN_ID = MD.PRT_CAMPAIGN_ID
LEFT JOIN
LOOKUP_STG__RAW_INPUT RIURL ON RIURL.RAW_INPUT_GK = BASE.URL_HASH
LEFT JOIN
LOOKUP_STG__RAW_INPUT RIREF ON RIREF.RAW_INPUT_GK = BASE.REF_HASH
LEFT JOIN
LOOKUP_STG__OS OS ON OS.OS_GK = MD.COUNTRY_CODE
LEFT JOIN
LOOKUP_STG__COUNTRY_CODE CC ON CC.COUNTRY_CODE_GK = MD.COUNTRY_CODE
LEFT JOIN
LOOKUP_STG__BROWSER BR ON BR.BROWSER_GK = MD.BROWSER
LEFT JOIN
LOOKUP_STG__IP_ADDRESS IP ON IP.IP_ADDRESS_GK = MD.IP_ADDRESS
LEFT JOIN
LOOKUP_STG__SELLER_NETWORK_ID SE ON SE.SELLER_NETWORK_ID_GK = MD.SELLER_NETWORK_ID
LEFT JOIN
LOOKUP_STG__FULL_USER_AGENT FU ON FU.FULL_USER_AGENT_GK = MD.FULL_USER_AGENT
;
Do you have any indexes on your key columns? You might try to put bitmap indexes on each key column in the main fact table and also in the dimension tables. That should give you a considerable speedup.
You say that the number of joins makes the query inefficient. But what you should be looking at is whether the tables that are joined with have the proper indexes in them. You obviously need all those fields from the tables you join with, so you can't simply not join with them.
For every JOIN you make in the query, verify that there exists an INDEX on the field that is joined with. For instance:
INNER JOIN
atlas__atlas_events MD ON BASE.EVENT_ID = MD.EVENT_ID
Does the table atlas__atlas_events have an INDEX on the EVENT_ID column?
You need to verify this for every such JOIN in your query. If such an INDEX does not exist you should create one.
If you execute this query in SQL Server Management Studio and include the actual execution plan, you will probably already see indications that you are missing indexes.

LEFT JOIN ON COALESCE(a, b, c) - very strange behavior

I have encountered very strange behavior of my query and I wasted a lot of time to understand what causes it, in vane. So I am asking for your help.
SELECT count(*) FROM main_table
LEFT JOIN front_table ON front_table.pk = main_table.fk_front_table
LEFT JOIN info_table ON info_table.pk = front_table.fk_info_table
LEFT JOIN key_table ON key_table.pk = COALESCE(info_table.fk_key_table, front_table.fk_key_table_1, front_table.fk_key_table_2)
LEFT JOIN side_table ON side_table.fk_front_table = front_table.pk
WHERE side_table.pk = (SELECT MAX(pk) FROM side_table WHERE fk_front_table = front_table.pk)
OR side_table.pk IS NULL
Seems like a simple join query, with coalesce, I've used this technique before(not too many times) and it worked right.
In this query I don't ever get nulls for side_table.pk. If I remove coalesce or just don't use key_table, then the query returns rows with many null side_table.pk, but if I add coalesce join, I can't get those nulls.
It seems key_table and side_table don't have anything in common, but the result is so weird.
Also, when I don't use side_table and WHERE clause, the count(*) result with coalesce and without differs, but I can't see any pattern in rows missing, it seems random!
Real query:
SELECT ECHANGE.EXC_AUTO_KEY, STOCK_RESERVATIONS.STR_AUTO_KEY FROM EXCHANGE
LEFT JOIN WO_BOM ON WO_BOM.WOB_AUTO_KEY = EXCHANGE.WOB_AUTO_KEY
LEFT JOIN VIEW_WO_SUB ON VIEW_WO_SUB.WOO_AUTO_KEY = WO_BOM.WOO_AUTO_KEY
LEFT JOIN STOCK stock3 ON stock3.STM_AUTO_KEY = EXCHANGE.STM_AUTO_KEY
LEFT JOIN STOCK stock2 ON stock2.STM_AUTO_KEY = EXCHANGE.ORIG_STM
LEFT JOIN CONSIGNMENT_CODES con2 ON con2.CNC_AUTO_KEY = stock2.CNC_AUTO_KEY
LEFT JOIN CONSIGNMENT_CODES con3 ON con3.CNC_AUTO_KEY = stock3.CNC_AUTO_KEY
LEFT JOIN CI_UTL ON CI_UTL.CUT_AUTO_KEY = EXCHANGE.CUT_AUTO_KEY
LEFT JOIN PART_CONDITION_CODES pcc2 ON pcc2.PCC_AUTO_KEY = stock2.PCC_AUTO_KEY
LEFT JOIN PART_CONDITION_CODES pcc3 ON pcc3.PCC_AUTO_KEY = stock3.PCC_AUTO_KEY
LEFT JOIN STOCK_RESERVATIONS ON STOCK_RESERVATIONS.STM_AUTO_KEY = stock3.STM_AUTO_KEY
LEFT JOIN WAREHOUSE wh2 ON wh2.WHS_AUTO_KEY = stock2.WHS_ORIGINAL
LEFT JOIN SM_HISTORY ON (SM_HISTORY.STM_AUTO_KEY = EXCHANGE.ORIG_STM AND SM_HISTORY.WOB_REF = EXCHANGE.WOB_AUTO_KEY)
LEFT JOIN RC_DETAIL ON stock3.RCD_AUTO_KEY = RC_DETAIL.RCD_AUTO_KEY
LEFT JOIN RC_HEADER ON RC_HEADER.RCH_AUTO_KEY = RC_DETAIL.RCH_AUTO_KEY
LEFT JOIN WAREHOUSE wh3 ON wh3.WHS_AUTO_KEY = COALESCE(RC_DETAIL.WHS_AUTO_KEY, stock3.WHS_ORIGINAL, stock3.WHS_AUTO_KEY)
WHERE STOCK_RESERVATIONS.STR_AUTO_KEY = (SELECT MAX(STR_AUTO_KEY) FROM STOCK_RESERVATIONS WHERE STM_AUTO_KEY = stock3.STM_AUTO_KEY)
OR STOCK_RESERVATIONS.STR_AUTO_KEY IS NULL
Removing LEFT JOIN WAREHOUSE wh3 gives me about unique EXC_AUTO_KEY values with a lot of NULL STR_AUTO_KEY, while leaving this row removes all NULL STR_AUTO_KEY.
I recreated simple tables with numbers with the same structure and query works without any problems o.0
I have a feeling COALESCE is acting as a REQUIRED flag for the joined table, hence shooting the LEFT JOIN to become an INNER JOIN.
Try this:
SELECT COUNT(*)
FROM main_table
LEFT JOIN front_table ON front_table.pk = main_table.fk_front_table
LEFT JOIN info_table ON info_table.pk = front_table.fk_info_table
LEFT JOIN key_table ON key_table.pk = NVL(info_table.fk_key_table, NVL(front_table.fk_key_table_1, front_table.fk_key_table_2))
LEFT JOIN (SELECT fk_, MAX(pk) as pk FROM side_table GROUP BY fk_) st ON st.fk_ = front_table.pk
NVL might behave just the same though...
I undertood what was the problem (not entirely though): there is a LEFT JOIN VIEW_WO_SUB in original query, 3rd line. It causes this query to act in a weird way.
When I replaced the view with the other table which contained the information I needed, the query started returning right results.
Basically, with this view join, NVL, COALESCE or CASE join with combination of certain arguments did not work along with OR clause in WHERE subquery, all rest was fine. ALthough, I could get the query to work with this view join, by changing the order of joined tables, I had to place table participating in where subquery to the bottom.

Reduce Runtime of T-SQL Query

-- WITH POD was causing the issue, removing this code reduced 2 year pull to 3 mins.
Will post new question to figure out best way to include POD data.
--Edit for clarity, I am a read only user to these tables.
I wrote the below query, but it takes a very long time to execute (20min).
It is currently limited to 1 month, but user wants at least 1 year preferably 2. I assume this would scale time to hours.
Can anyone take a look at let me know if there is a BKM I am not using to improve performance?
Or if there is a better method for a report of this size? At 2 years, it would return ~100K rows from 17 tables.
WITH
POD AS
(
SELECT SHIPMENTS.Delivery
,SHIPMENTS.Shipment_Number
,PROOF_OF_DELIVERY.Shipping_Carrier
,PROOF_OF_DELIVERY.Tracking_Number
,PROOF_OF_DELIVERY.Ship_Method
,PROOF_OF_DELIVERY.POD_Signature
,PROOF_OF_DELIVERY.POD_Date
,PROOF_OF_DELIVERY.POD_Time
FROM
SHIPMENTS
LEFT JOIN PROOF_OF_DELIVERY
ON SHIPMENTS.Shipment_Number = PROOF_OF_DELIVERY.Delivery_Or_Shipment
WHERE Load_Date IN
(
SELECT MAX(Load_Date)
FROM PROOF_OF_DELIVERY
GROUP BY Delivery_Or_Shipment
)
)
SELECT DISTINCT GI.GOODS_ISSUE_DOCUMENT_ID
,GI.SALES_ORDER_ID
,GI.SALES_ORDER_LINE_ID
,GI.SALES_ORDER_TYPE_CODE
,GI.DELIVERY_HEADER_ID
,GI.DELIVERY_ITEM_ID
,FD.FISCAL_MONTH_CODE
,GI.MATERIAL_NUMBER
,GI.SHIPPED_QTY
,SO.ORDERER_NAME
,SO.CREATED_BY
,SO.CONTACT_PERSON
,GI.SOLD_TO_CUSTOMER_ID
,GI.SHIP_TO_CUSTOMER_ID
,GI.ORIGINAL_COMMIT_DATE
,GI.SHIP_FROM_PLANT_ID
,GI.ACTUAL_PGI_DATE
,GI.CUSTOMER_PO_NUMBER
,GI.SHIPPED_PRICE
,(GI.SHIPPED_PRICE * GI.SHIPPED_QTY) AS EXT_SHIPPED_PRICE
,GI.SALES_ORGANIZATION_CODE
,GI.DELIVERY_NOTE_PRIORITY_CODE
,FD.FISCAL_WEEK_CODE
,DV.DIVISION_CODE
,DN.Delivery_Item_Creation_Date
,SOLD.CUSTOMER_SHORT_NAME AS SOLD_TO_CUSTOMER_SHORT_NAME
,SHIP.CUSTOMER_SHORT_NAME AS SHIP_TO_CUSTOMER_SHORT_NAME
,SHIP.Customer_Site_Name
,SHIP.REGION_NAME
,MATD.MATERIAL_DESCRIPTION
,MATD.STANDARD_COST
,(MATD.STANDARD_COST * GI.SHIPPED_QTY) AS EXT_STANDARD_COST
,MATD.GLOBAL_EVENT
,PLT.LEAD_TIME_FOR_ORIGINAL_COMMIT
,OPRM.BASE_PART_CODE
,MATD.PRODUCT_INSP_MEMO
,MATD.MATERIAL_PRICING_GROUP_CODE
,MATD.MATERIAL_STATUS AS MMPP
,PIM.PIM_PBG_GROUPING
,SOL.SHIPPING_CONDITION
,SVO.SERVICE_ORDER_NUM
,SO.CREATION_TIME AS SO_CREATION_TIME
,SOL.CREATED_TIME AS SO_LINE_CREATED_TIME
,SOL.SHIPPING_POINT
,SDT.SALES_DOCUMENT_TYPE_CODE AS SVO_DOCUMENT_TYPE_CODE
,EQU.EQUIPMENT_NUM
,EQU.SERIAL_NUMBER
,EQU.CUSTOMERTOOLID
,POD.Shipment_Number
,POD.Shipping_Carrier
,POD.Tracking_Number
,POD.Ship_Method
,POD.POD_Signature
,POD.POD_Date
,POD.POD_Time
,DATEDIFF(dd,SO.CREATION_TIME,GI.ACTUAL_PGI_DATE) AS Cycle_Time_to_PGI_Days
,DATEDIFF(hh,SO.CREATION_TIME,GI.ACTUAL_PGI_DATE) AS Cycle_Time_to_PGI_Hours
FROM GOODS_ISSUE AS GI
INNER JOIN dbo.Delivery_Notes AS DN
ON GI.DELIVERY_HEADER_ID = DN.DELIVERY_HEADER_CODE AND GI.DELIVERY_ITEM_ID = DN.DELIVERY_ITEM_CODE
INNER JOIN dbo.Customer_View AS SOLD
ON GI.SOLD_TO_CUSTOMER_ID = SOLD.CUSTOMER_CODE
INNER JOIN dbo.Customer_View AS SHIP
ON GI.SOLD_TO_CUSTOMER_ID = SHIP.CUSTOMER_CODE
INNER JOIN dbo.MATERIAL_DETAILS AS MATD
ON GI.MATERIAL_NUMBER = MATD.MATERIAL_NUMBER
INNER JOIN dbo.OPR_MATERIAL_DIM AS OPRM
ON OPRM.MATERIAL_NUMBER = GI.MATERIAL_NUMBER
LEFT JOIN dbo.SM_DATE_DIM AS FD
ON CAST(FD.CALENDAR_DAY AS DATE) = CAST(GI.ACTUAL_PGI_DATE AS DATE)
LEFT JOIN dbo.DIM_PUBLISHED_LEAD_TIME_COMMIT AS PLT
ON PLT.MATERIAL_NUMBER = OPRM.BASE_PART_CODE
LEFT JOIN dbo.PRODUCT_INSP_MEMO_DIM AS PIM
ON PIM.PRODUCT_INSP_MEMO = MATD.PRODUCT_INSP_MEMO
INNER JOIN dbo.SM_SALES_ORDER_LINE_FACT AS SOL
ON SOL.SALES_ORDER_CODE = GI.SALES_ORDER_ID AND SOL.SALES_ORDER_LINE_CODE = GI.SALES_ORDER_LINE_ID
INNER JOIN dbo.SM_SALES_ORDER_FACT AS SO
ON SO.SALES_ORDER_CODE = GI.SALES_ORDER_ID
INNER JOIN dbo.SM_DIVISION_DIM AS DV
ON SO.DIVISION_SID = DV.DIVISION_SID
LEFT JOIN dbo.SERVICE_ORDER_FACT AS SVO
ON SVO.SERVICE_ORDER_NUM = SO.SERVICE_ORDER_NUMBER
LEFT JOIN dbo.SM_SALES_DOCUMENT_TYPE_DIM AS SDT
ON SDT.SALES_DOCUMENT_TYPE_SID = SVO.SALES_DOCUMENT_TYPE_SID
LEFT JOIN dbo.SM_EQUIPMENT_DIM AS EQU
ON EQU.EQUIPMENT_SID = SVO.EQUIPMENT_SID
LEFT JOIN POD
ON POD.Delivery = GI.DELIVERY_HEADER_ID
WHERE GI.ACTUAL_PGI_DATE > GETDATE()-32
AND SOLD_TO_CUSTOMER_ID IN (0010000252,0010000898,0010001121,0010001409,0010001842,0010001852,0010001879,0010001977,0010001978,0010002021,0010002202,0010002227,0010002982,0010003118,0010003176,0010003294,0010005492,0010006904,0010007048,0010007080,0010010381,0010010572,0010010905,0010011999,0010012014,0010012048,0010012571,0010013124,0010013711,0010013713,0010013824,0010014180,0010014188,0010014333,0010015059,0010015313,0010015414,0010015541,0010015544,0010015550)
A CTE is just syntax
I suspect that CTE is evaluated many times
Materialze the CTE to #temp with indexe(s) so it is run once
This cast will hurt it
Make those columns true dates and index them
ON CAST(FD.CALENDAR_DAY AS DATE) = CAST(GI.ACTUAL_PGI_DATE AS DATE)
That where negates the left so you can just do a join
Also that MAX(Load_Date) could match on another shipment
SELECT SHIPMENTS.Delivery
,SHIPMENTS.Shipment_Number
,PROOF_OF_DELIVERY.Shipping_Carrier
,PROOF_OF_DELIVERY.Tracking_Number
,PROOF_OF_DELIVERY.Ship_Method
,PROOF_OF_DELIVERY.POD_Signature
,PROOF_OF_DELIVERY.POD_Date
,PROOF_OF_DELIVERY.POD_Time
FROM SHIPMENTS
JOIN PROOF_OF_DELIVERY
ON SHIPMENTS.Shipment_Number = PROOF_OF_DELIVERY.Delivery_Or_Shipment
WHERE PROOF_OF_DELIVERY.Load_Date IN
(
SELECT MAX(Load_Date)
FROM PROOF_OF_DELIVERY
GROUP BY Delivery_Or_Shipment
)
Pull this up into the join
INNER JOIN dbo.Customer_View AS SOLD
ON GI.SOLD_TO_CUSTOMER_ID = SOLD.CUSTOMER_CODE
AND GI.SOLD_TO_CUSTOMER_ID IN (0010000252,0010000898,0010001121,0010001409,0010001842,0010001852,0010001879,0010001977,0010001978,0010002021,0010002202,0010002227,0010002982,0010003118,0010003176,0010003294,0010005492,0010006904,0010007048,0010007080,0010010381,0010010572,0010010905,0010011999,0010012014,0010012048,0010012571,0010013124,0010013711,0010013713,0010013824,0010014180,0010014188,0010014333,0010015059,0010015313,0010015414,0010015541,0010015544,0010015550)

How to improve the performance of a SQL query even after adding indexes?

I am trying to execute the following sql query but it takes 22 seconds to execute. the number of returned items is 554192. I need to make this faster and have already put indexes in all the tables involved.
SELECT mc.name AS MediaName,
lcc.name AS Country,
i.overridedate AS Date,
oi.rating,
bl1.firstname + ' ' + bl1.surname AS Byline,
b.id BatchNo,
i.numinbatch ItemNumberInBatch,
bah.changedatutc AS BatchDate,
pri.code AS IssueNo,
pri.name AS Issue,
lm.neptunemessageid AS MessageNo,
lmt.name AS MessageType,
bl2.firstname + ' ' + bl2.surname AS SourceFullName,
lst.name AS SourceTypeDesc
FROM profiles P
INNER JOIN profileresults PR
ON P.id = PR.profileid
INNER JOIN items i
ON PR.itemid = I.id
INNER JOIN batches b
ON b.id = i.batchid
INNER JOIN itemorganisations oi
ON i.id = oi.itemid
INNER JOIN lookup_mediachannels mc
ON i.mediachannelid = mc.id
LEFT OUTER JOIN lookup_cities lc
ON lc.id = mc.cityid
LEFT OUTER JOIN lookup_countries lcc
ON lcc.id = mc.countryid
LEFT OUTER JOIN itembylines ib
ON ib.itemid = i.id
LEFT OUTER JOIN bylines bl1
ON bl1.id = ib.bylineid
LEFT OUTER JOIN batchactionhistory bah
ON b.id = bah.batchid
INNER JOIN itemorganisationissues ioi
ON ioi.itemorganisationid = oi.id
INNER JOIN projectissues pri
ON pri.id = ioi.issueid
LEFT OUTER JOIN itemorganisationmessages iom
ON iom.itemorganisationid = oi.id
LEFT OUTER JOIN lookup_messages lm
ON iom.messageid = lm.id
LEFT OUTER JOIN lookup_messagetypes lmt
ON lmt.id = lm.messagetypeid
LEFT OUTER JOIN itemorganisationsources ios
ON ios.itemorganisationid = oi.id
LEFT OUTER JOIN bylines bl2
ON bl2.id = ios.bylineid
LEFT OUTER JOIN lookup_sourcetypes lst
ON lst.id = ios.sourcetypeid
WHERE p.id = #profileID
AND b.statusid IN ( 6, 7 )
AND bah.batchactionid = 6
AND i.statusid = 2
AND i.isrelevant = 1
when looking at the execution plan I can see an step which is costing 42%. Is there any way I could get this to a lower threshold or any way that I can improve the performance of the whole query.
Remove the profiles table as it is not needed and change the WHERE clause to
WHERE PR.profileid = #profileID
You have a left outer join on the batchactionhistory table but also have a condition in your WHERE clause which turns it back into an inner join. Change you code to this:
LEFT OUTER JOIN batchactionhistory bah
ON b.id = bah.batchid
AND bah.batchactionid = 6
You don't need the batches table as it is used to join other tables which could be joined directly and to show the id in you SELECT which is also available in other tables. Make the following changes:
i.batchidid AS BatchNo,
LEFT OUTER JOIN batchactionhistory bah
ON i.batchidid = bah.batchid
Are any of the fields that are used in joins or the WHERE clause from tables that contain large amounts of data but are not indexed. If so try adding an index on at time to the largest table.
Do you need every field in the result - if you could loose one or to you maybe could reduce the number of tables further.
First, if this is not a stored procedure, make it one. That's a lot of text for sql server to complile.
Next, my experience is that "worst practices" are occasionally a good idea. Specifically, I have been able to improve performance by splitting large queries into a couple or three small ones and assembling the results.
If this query is associated with a .net, coldfusion, java, etc application, you might be able to do the split/re-assemble in your application code. If not, a temporary table might come in handy.