WHERE in Sql, combining two fast conditions multiplies costs many times - sql

I have a fairly complex sql that returns 2158 rows' id from a table with ~14M rows. I'm using CTEs for simplification.
The WHERE consists of two conditions. If i comment out one of them, the other runs in ~2 second. If i leave them both (separated by OR) the query runs ~100 seconds. The first condition alone needs 1-2 seconds and returns 19 rows, the second condition alone needs 0 seconds and returns 2139 rows.
What can be the reason?
This is the complete SQL:
WITH fpcRepairs AS
(
SELECT FPC_Row = ROW_NUMBER()OVER(PARTITION BY t.SSN_Number ORDER BY t.Received_Date, t.Claim_Creation_Date, t.Repair_Completion_Date, t.Claim_Submitted_Date)
, idData, Repair_Completion_Date, Received_Date, Work_Order, SSN_number, fiMaxActionCode, idModel,ModelName
, SP=(SELECT TOP 1 Reused_Indicator FROM tabDataDetail td INNER JOIN tabSparePart sp ON td.fiSparePart=sp.idSparePart
WHERE td.fiData=t.idData
AND (td.Material_Quantity <> 0)
AND (sp.SparePartName = '1254-3751'))
FROM tabData AS t INNER JOIN
modModel AS m ON t.fiModel = m.idModel
WHERE (m.ModelName = 'LT26i')
AND EXISTS(
SELECT NULL
FROM tabDataDetail AS td
INNER JOIN tabSparePart AS sp ON td.fiSparePart = sp.idSparePart
WHERE (td.fiData = t.idData)
AND (td.Material_Quantity <> 0)
AND (sp.SparePartName = '1254-3751')
)
), needToChange AS
(
SELECT idData FROM tabData AS t INNER JOIN
modModel AS m ON t.fiModel = m.idModel
WHERE (m.ModelName = 'LT26i')
AND EXISTS(
SELECT NULL
FROM tabDataDetail AS td
INNER JOIN tabSparePart AS sp ON td.fiSparePart = sp.idSparePart
WHERE (td.fiData = t.idData)
AND (td.Material_Quantity <> 0)
AND (sp.SparePartName IN ('1257-2741','1257-2742','1248-2338','1254-7035','1248-2345','1254-7042'))
)
)
SELECT t.idData
FROM tabData AS t INNER JOIN modModel AS m ON t.fiModel = m.idModel
INNER JOIN needToChange ON t.idData = needToChange.idData -- needs to change FpcAssy
LEFT OUTER JOIN fpcRepairs rep ON t.idData = rep.idData
WHERE
rep.idData IS NOT NULL -- FpcAssy replaced, check if reused was claimed correctly
AND rep.FPC_Row > 1 -- other FpcAssy repair before
AND (
SELECT SP FROM fpcRepairs lastRep
WHERE lastRep.SSN_Number = rep.SSN_Number
AND lastRep.FPC_Row = rep.FPC_Row - 1
) = rep.SP -- same SP, must be rejected(reused+reused or new+new)
OR
rep.idData IS NOT NULL -- FpcAssy replaced, check if reused was claimed correctly
AND rep.FPC_Row = 1 -- no other FpcAssy repair before
AND rep.SP = 0 -- not reused, must be rejected
order by t.idData
Here's the execution plan:
Download: http://www.filedropper.com/exeplanfpc

Try to use UNION ALL of 2 queries separately instead of OR condition.
I've tried it many times and it really helped. I've read about this issue in Art Of SQL .
Read it, you can find many useful information about performance issues.
UPDATE:
Check related questions
UNION ALL vs OR condition in sql server query
http://www.sql-server-performance.com/2011/union-or-sql-server-queries/
Can UNION ALL be faster than JOINs or do my JOINs just suck?
Check Wes's answer
The usage of the OR is probably causing the query optimizer to no longer use an index in the second query.

Related

SQL Server Execution Plan Review Request

Having trouble understanding why my query is taking so long, looking for advice to optimise please.
update Laserbeak_Main.dbo.ACCOUNT_MPN set
DateUpgrade = ord.ConnectedDate
FROM [ORDER] ord
WHERE ord.AccountNumber = Laserbeak_Main.dbo.ACCOUNT_MPN.AccountNumber
AND ord.ordertypeID = '2'
AND ord.ConnectedDate IS NOT NULL
AND DateUpgrade <> ord.ConnectedDate
Execution plan as requested on brentozar.com
UPDATE: Following suggestions the new query looks like this & seems to work much more quickly. However if you run the query it sets the rows as expected, then run again it updates the exact same number of rows. Converting to a select confirms that the same rows are being updated each time. The <> clause should stop this but it doesn't. I believed it was something to do with collation but have been unable to confirm if its possible to have different collations at table level in the same database.
;WITH cteOrderInfo AS (
SELECT DISTINCT ord.AccountNumber, ord.ConnectedDate
FROM [ORDER] ord
WHERE ord.ordertypeID = '2'
AND ord.ConnectedDate IS NOT NULL
)
UPDATE Laserbeak_Main.dbo.ACCOUNT_MPN
SET Laserbeak_Main.dbo.ACCOUNT_MPN.DateUpgrade = cteOrderInfo.ConnectedDate
FROM cteOrderInfo
INNER JOIN Laserbeak_Main.dbo.ACCOUNT_MPN acc
ON cteOrderInfo.AccountNumber = acc.AccountNumber
WHERE cteOrderInfo.ConnectedDate <> acc.DateUpgrade
The SELECT to confirm:
;WITH cteOrderInfo AS (
SELECT DISTINCT ord.AccountNumber, ord.ConnectedDate
FROM [ORDER] ord
WHERE ord.ordertypeID = '2'
AND ord.ConnectedDate IS NOT NULL
)
SELECT cteOrderInfo.ConnectedDate, acc.DateUpgrade
FROM cteOrderInfo
INNER JOIN Laserbeak_Main.dbo.ACCOUNT_MPN acc
ON cteOrderInfo.AccountNumber = acc.AccountNumber
WHERE cteOrderInfo.ConnectedDate <> acc.DateUpgrade
SELECT Results Sample:
As Serge suggested, we did not have unique rows.
the solution we arrived at:
;WITH cteSourceStuff AS (
SELECT AccountNumber, MpnUpgrade, MAX(DateConnected) maxConnDate
FROM ORDER_DETAIL, [ORDER]
WHERE ORDER_DETAIL.OrderID = [ORDER].OrderID
AND LEN(MpnUpgrade) > 10
AND OrderTypeID = 2
GROUP BY AccountNumber, MpnUpgrade
)
UPDATE Laserbeak_Main.dbo.ACCOUNT_MPN set
DateUpgrade = cteSourceStuff.maxConnDate
FROM cteSourceStuff
WHERE cteSourceStuff.MpnUpgrade = ACCOUNT_MPN.Mpn
AND cteSourceStuff.AccountNumber = ACCOUNT_MPN.AccountNumber
AND DateUpgrade <> cteSourceStuff.maxConnDate
This works because the duplicates are initially removed, then we only update the rows that we are actually targeting. The reason we have issues before was that SQL was updating the 1st row it found, then when we re-ran or ran the select it was return rows matched on the key but that had not previously been updated.

How to find duplicates in a large table based on matching and non matching fields?

I have a very large table with more than 10 million records. I want to find duplicates based on some fields matching and some fields not matching in it.
The query currently I am using is below:
SELECT DISTINCT MainTable.[lineitemid]
FROM [dbo].[lineitem] MainTable
INNER JOIN [dbo].[lineitem] AS ChildTable
ON ChildTable.invoicedate = MainTable.invoicedate
AND LEFT(ChildTable.vendorname, 4) = LEFT(MainTable.vendorname, 4)
AND ChildTable.invoiceid <> MainTable.invoiceid AND -- Invoice ID column not matching
ChildTable.documentcurrencyamount = MainTable.documentcurrencyamount
WHERE ChildTable.lineitemid <> MainTable.lineitemid AND -- LineItemId is PK
MainTable.projectid = 1125 AND ChildTable.projectid = 1125 -- Duplicates should be identified with specific ProjectId
This query is working fine if the number of records for ProjectId is under 100,000.
When the ProjectId records are more than 1 million, while executing this query, the tempdb size shoots up to 100 GB and causing low disk space issues. The query is taking forever to execute.
Please help me in optimizing the query.
Added the below lines after getting answer for the above query....
Thanks a lot, #Gordon-Linoff. The query you suggested worked much faster. The VendorName is from a different table. Can I include a inner join as shown below?
SELECT li1.[LineItemId]
FROM [dbo].[LineItem] li1
INNER JOIN VendorMaster vm1 ON li1.VendorNumber=vm1.VendorNumber
AND vm1.CompanyCode = li1.CompanyCode
WHERE EXISTS (SELECT 1
FROM [dbo].[LineItem] as li2
INNER JOIN VendorMaster vm2 on li2.VendorNumber = vm2.VendorNumber
AND vm2.CompanyCode = li2.CompanyCode
WHERE li2.InvoiceDate = li.InvoiceDate and
LEFT(li2.VendorName, 4) = LEFT(li1.VendorName, 4) and
li2.InvoiceId <> li1.InvoiceId and -- Invoice ID column not matching
li2.DocumentCurrencyAmount = li1.DocumentCurrencyAmount and
li2.LineItemId <> li1.LineItemId and
li2.ProjectId = li1.ProjectId
li2.VendorNumber = li.VendorNumber)
AND li.ProjectId = 1125
Is it an efficient approach?
A less expensive way to run this query is to use exists and dispense with the distinct:
SELECT li.[LineItemId]
FROM [dbo].[LineItem] li
WHERE EXISTS (SELECT 1
FROM [dbo].[LineItem] as li2 on
WHERE li2.InvoiceDate = li.InvoiceDate and
LEFT(li2.VendorName, 4) = LEFT(li.VendorName, 4) and
li2.InvoiceId <> li.InvoiceId and -- Invoice ID column not matching
li2.DocumentCurrencyAmount = li.DocumentCurrencyAmount and
li2.LineItemId <> li.LineItemId and
li2.ProjectId = li.ProjectId
WHERE MainTable.ProjectId = 1125;
For performance, an index on LineItem(ProjectId, InvoiceDate, DocumentCurrencyAmount, VendorName, InvoiceId, LineItemId) would help. You could further speed the query by declaring LEFT(LineItem.VendorName, 4) as a computed column and adding it to the index before VendorName.

Confused in join query in SQL

The following works:
SELECT IBAD.TRM_CODE, IBAD.IPABD_CUR_QTY, BM.BOQ_ITEM_NO,
IBAD.BCI_CODE, BCI.BOQ_CODE
FROM IPA_BOQ_ABSTRCT_DTL IBAD,
BOQ_CONFIG_INF BCI,BOQ_MST BM
WHERE BM.BOQ_CODE = BCI.BOQ_CODE
AND BCI.BCI_CODE = IBAD.BCI_CODE
AND BCI.STATUS = 'Y'
AND BM.STATUS = 'Y'
order by boq_item_no;
Results:
But after joining many tables with that query, the result is confusing:
SELECT (SELECT CMN_NAME
FROM CMN_MST
WHERE CMN_CODE= BRI.CMN_RLTY_MTRL) MTRL,
RRI.RRI_RLTY_RATE AS RATE,
I.BOQ_ITEM_NO,
(TRIM(TO_CHAR(IBAD.IPABD_CUR_QTY,
'9999999999999999999999999999990.999'))) AS IPABD_CUR_QTY,
TRIM(TO_CHAR(BRI.BRI_WT_FACTOR,
'9999999999999999999999999999990.999')) AS WT,
TRIM(TO_CHAR((IBAD.IPABD_CUR_QTY*BRI.BRI_WT_FACTOR),
'9999999999999999999999990.999')) AS RLTY_QTY,
(TRIM(TO_CHAR((IBAD.IPABD_CUR_QTY*BRI.BRI_WT_FACTOR*RRI.RRI_RLTY_RATE),
'9999999999999999999999990.99'))) AS TOT_AMT,
I.TRM_CODE AS TRM
FROM
(SELECT * FROM ipa_boq_abstrct_dtl) IBAD
INNER JOIN
(SELECT * FROM BOQ_RLTY_INF) BRI
ON IBAD.BCI_CODE = BRI.BCI_CODE
INNER JOIN
(SELECT * FROM RLTY_RATE_INF) RRI
ON BRI.CMN_RLTY_MTRL = RRI.CMN_RLTY_MTRL
INNER JOIN
( SELECT IBAD.TRM_CODE, IBAD.IPABD_CUR_QTY,
BM.BOQ_ITEM_NO, IBAD.BCI_CODE, BCI.BOQ_CODE
FROM IPA_BOQ_ABSTRCT_DTL IBAD,
BOQ_CONFIG_INF BCI,BOQ_MST BM
WHERE
BM.BOQ_CODE = BCI.BOQ_CODE
AND BCI.BCI_CODE = IBAD.BCI_CODE
and BCI.status = 'Y'
and bm.status = 'Y') I
ON BRI.BCI_CODE = I.BCI_CODE
AND I.TRM_CODE = BRI.TRM_CODE
AND BRI.TRM_CODE =4
group by BRI.CMN_RLTY_MTRL, RRI.RRI_RLTY_RATE, I.BOQ_ITEM_NO,
IBAD.IPABD_CUR_QTY, BRI.BRI_WT_FACTOR, I.TRM_CODE, I.bci_code
order by BRI.CMN_RLTY_MTRL
Results:
TRM should be 11 instead of 4 in the first row.
you getting 4 because you use
AND BRI.TRM_CODE =4
if you remove this criter you can get true result
In your first query, both of the rows you've highlighted have BCI_CODE=1866.
In the second query, you are joining that result set with a number of others (which come from the same tables, which seems odd). In particular, you are joining from the subquery to another table using BCI_CODE, and from there to (SELECT * FROM ipa_boq_abstrct_dtl) IBAD. Since both of the rows from the subquery have the same BCI_CODE, they will join to the same rows in the other tables.
The quantity that you are actually displaying in the second query is from (SELECT * FROM ipa_boq_abstrct_dtl) IBAD, not from the other subquery.
Is the problem simply that you mean to select I.IPABD_CUR_QTY instead of IBAD.IPABD_CUR_QTY?
You might find this clearer if you did not reuse the same aliases for tables at multiple points in the query.

Compare values from one table with the results from a query?

First, I will explain the what is being captured. User's have a member level associated with their accounts (Bronze, Gold, Diamond, etc). A nightly job needs to run to calculate the orders from today a year back. If the order total for a given user goes over or under a certain amount their level is upgraded or downgraded. The table where the level information is stored will not change much, but the minimum and maximum amount thresholds may over time. This is what the table looks like:
CREATE TABLE [dbo].[MemberAdvantageLevels] (
[Id] int NOT NULL IDENTITY(1,1) ,
[Name] varchar(255) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[MinAmount] int NOT NULL ,
[MaxAmount] int NOT NULL ,
CONSTRAINT [PK__MemberAd__3214EC070D9DF1C7] PRIMARY KEY ([Id])
)
ON [PRIMARY]
GO
I wrote a query that will group the orders by user for the year to date. The query includes their current member level.
SELECT
Sum(dbo.tbh_Orders.SubTotal) AS OrderTotals,
Count(dbo.UserProfile.UserId) AS UserOrders,
dbo.UserProfile.UserId,
dbo.UserProfile.UserName,
dbo.UserProfile.Email,
dbo.MemberAdvantageLevels.Name,
dbo.MemberAdvantageLevels.MinAmount,
dbo.MemberAdvantageLevels.MaxAmount,
dbo.UserMemberAdvantageLevels.LevelAchievmentDate,
dbo.UserMemberAdvantageLevels.LevelAchiementAmount,
dbo.UserMemberAdvantageLevels.IsCurrent as IsCurrentLevel,
dbo.MemberAdvantageLevels.Id as MemberLevelId,
FROM
dbo.tbh_Orders
INNER JOIN dbo.tbh_OrderStatuses ON dbo.tbh_Orders.StatusID = dbo.tbh_OrderStatuses.OrderStatusID
INNER JOIN dbo.UserProfile ON dbo.tbh_Orders.CustomerID = dbo.UserProfile.UserId
INNER JOIN dbo.UserMemberAdvantageLevels ON dbo.UserProfile.UserId = dbo.UserMemberAdvantageLevels.UserId
INNER JOIN dbo.MemberAdvantageLevels ON dbo.UserMemberAdvantageLevels.MemberAdvantageLevelId = dbo.MemberAdvantageLevels.Id
WHERE
dbo.tbh_OrderStatuses.OrderStatusID = 4 AND
(dbo.tbh_Orders.AddedDate BETWEEN dateadd(year,-1,getdate()) AND GETDATE()) and IsCurrent = 1
GROUP BY
dbo.UserProfile.UserId,
dbo.UserProfile.UserName,
dbo.UserProfile.Email,
dbo.MemberAdvantageLevels.Name,
dbo.MemberAdvantageLevels.MinAmount,
dbo.MemberAdvantageLevels.MaxAmount,
dbo.UserMemberAdvantageLevels.LevelAchievmentDate,
dbo.UserMemberAdvantageLevels.LevelAchiementAmount,
dbo.UserMemberAdvantageLevels.IsCurrent,
dbo.MemberAdvantageLevels.Id
So, I need to check the OrdersTotal and if it exceeds the current level threshold, I then need to find the Level that fits their current order total and create a new record with their new level.
So for example, lets say jon#doe.com currently is at bronze. The MinAmount for bronze is 0 and the MaxAmount is 999. Currently his Orders for the year are at $2500. I need to find the level that $2500 fits within and upgrade his account. I also need to check their LevelAchievmentDate and if it is outside of the current year we may need to demote the user if there has been no activity.
I was thinking I could create a temp table that holds the results of all levels and then somehow create a CASE statement in the query above to determine the new level. I don't know if that is possible. Or, is it better to iterate over my order results and perform additional queries? If I use the iteration pattern I know i can use the When statement to iterate over the rows.
Update
I updated my Query A bit and so far came up with this, but I may need more information than just the ID from the SubQuery
Select * into #memLevels from MemberAdvantageLevels
SELECT
Sum(dbo.tbh_Orders.SubTotal) AS OrderTotals,
Count(dbo.AZProfile.UserId) AS UserOrders,
dbo.AZProfile.UserId,
dbo.AZProfile.UserName,
dbo.AZProfile.Email,
dbo.MemberAdvantageLevels.Name,
dbo.MemberAdvantageLevels.MinAmount,
dbo.MemberAdvantageLevels.MaxAmount,
dbo.UserMemberAdvantageLevels.LevelAchievmentDate,
dbo.UserMemberAdvantageLevels.LevelAchiementAmount,
dbo.UserMemberAdvantageLevels.IsCurrent as IsCurrentLevel,
dbo.MemberAdvantageLevels.Id as MemberLevelId,
(Select Id from #memLevels where Sum(dbo.tbh_Orders.SubTotal) >= #memLevels.MinAmount and Sum(dbo.tbh_Orders.SubTotal) <= #memLevels.MaxAmount) as NewLevelId
FROM
dbo.tbh_Orders
INNER JOIN dbo.tbh_OrderStatuses ON dbo.tbh_Orders.StatusID = dbo.tbh_OrderStatuses.OrderStatusID
INNER JOIN dbo.AZProfile ON dbo.tbh_Orders.CustomerID = dbo.AZProfile.UserId
INNER JOIN dbo.UserMemberAdvantageLevels ON dbo.AZProfile.UserId = dbo.UserMemberAdvantageLevels.UserId
INNER JOIN dbo.MemberAdvantageLevels ON dbo.UserMemberAdvantageLevels.MemberAdvantageLevelId = dbo.MemberAdvantageLevels.Id
WHERE
dbo.tbh_OrderStatuses.OrderStatusID = 4 AND
(dbo.tbh_Orders.AddedDate BETWEEN dateadd(year,-1,getdate()) AND GETDATE()) and IsCurrent = 1
GROUP BY
dbo.AZProfile.UserId,
dbo.AZProfile.UserName,
dbo.AzProfile.Email,
dbo.MemberAdvantageLevels.Name,
dbo.MemberAdvantageLevels.MinAmount,
dbo.MemberAdvantageLevels.MaxAmount,
dbo.UserMemberAdvantageLevels.LevelAchievmentDate,
dbo.UserMemberAdvantageLevels.LevelAchiementAmount,
dbo.UserMemberAdvantageLevels.IsCurrent,
dbo.MemberAdvantageLevels.Id
This hasn't been syntax checked or tested but should handle the inserts and updates you describe. The insert can be done as single statement using a derived/virtual table which contains the orders group by caluclation. Note that both the insert and update statement be done within the same transaction to ensure no two records for the same user can end up with IsCurrent = 1
INSERT UserMemberAdvantageLevels (UserId, MemberAdvantageLevelId, IsCurrent,
LevelAchiementAmount, LevelAchievmentDate)
SELECT t.UserId, mal.Id, 1, t.OrderTotals, GETDATE()
FROM
(SELECT ulp.UserId, SUM(ord.SubTotal) OrderTotals, COUNT(ulp.UserId) UserOrders
FROM UserLevelProfile ulp
INNER JOIN tbh_Orders ord ON (ord.CustomerId = ulp.UserId)
WHERE ord.StatusID = 4
AND ord.AddedDate BETWEEN DATEADD(year,-1,GETDATE()) AND GETDATE()
GROUP BY ulp.UserId) AS t
INNER JOIN MemberAdvantageLevels mal
ON (t.OrderTotals BETWEEN mal.MinAmount AND mal.MaxAmount)
-- Left join needed on next line in case user doesn't currently have a level
LEFT JOIN UserMemberAdvantageLevels umal ON (umal.UserId = t.UserId)
WHERE umal.MemberAdvantageLevelId IS NULL -- First time user has been awarded a level
OR (mal.Id <> umal.MemberAdvantageLevelId -- Level has changed
AND (t.OrderTotals > umal.LevelAchiementAmount -- Acheivement has increased (promotion)
OR t.UserOrders = 0)) -- No. of orders placed is zero (de-motion)
/* Reset IsCurrent flag where new record has been added */
UPDATE UserMemberAdvantageLevels
SET umal1.IsCurrent=0
FROM UserMemberAdvantageLevels umal1
INNER JOIN UserMemberAdvantageLevels umal2 On (umal2.UserId = umal1.UserId)
WHERE umal1.IsCurrent = 1
AND umal2.IsCurrent = 2
AND umal1.LevelAchievmentDate < umal2.LevelAchievmentDate)
One approach:
with cte as
(SELECT Sum(o.SubTotal) AS OrderTotals,
Count(p.UserId) AS UserOrders,
p.UserId,
p.UserName,
p.Email,
l.Name,
l.MinAmount,
l.MaxAmount,
ul.LevelAchievmentDate,
ul.LevelAchiementAmount,
ul.IsCurrent as IsCurrentLevel,
l.Id as MemberLevelId
FROM dbo.tbh_Orders o
INNER JOIN dbo.UserProfile p ON o.CustomerID = p.UserId
INNER JOIN dbo.UserMemberAdvantageLevels ul ON p.UserId = ul.UserId
INNER JOIN dbo.MemberAdvantageLevels l ON ul.MemberAdvantageLevelId = l.Id
WHERE o.StatusID = 4 AND
o.AddedDate BETWEEN dateadd(year,-1,getdate()) AND GETDATE() and
IsCurrent = 1
GROUP BY
p.UserId, p.UserName, p.Email, l.Name, l.MinAmount, l.MaxAmount,
ul.LevelAchievmentDate, ul.LevelAchiementAmount, ul.IsCurrent, l.Id)
select cte.*, ml.*
from cte
join #memLevels ml
on cte.OrderTotals >= ml.MinAmount and cte.OrderTotals <= ml.MaxAmount

SQL Server "ORDER BY" optimization - massive performance decrease

Using SQL Server 2000. I have a table that receives a dump from a legacy system once a day, I am trying to write a query that will process this table with a few reference table joins and an order by clause.
This is the SQL I have:
select d.acct_no,
d.associate_id,
d.first_name,
d.last_name,
d.acct_bal,
plr.long_name p_lvl,
tlr.long_name t_lvl,
d.category,
d.status,
tm.site_name,
d.addr1 + ' ' + isnull(d.addr2,'') address,
d.city,
d.state,
d.country,
d.post_code,
CASE WHEN d.home_phone_ok = 1 THEN d.home_phone END home_phone,
CASE WHEN d.work_phone_ok = 1 THEN d.work_phone END work_phone,
CASE WHEN d.alt_phone_ok = 1 THEN d.alt_phone END alt_phone,
CASE WHEN d.email_ok = 1 THEN d.email END email,
d.last_credit last_paid,
d.service,
d.quantity,
d.amount,
ar.area_desc area
from item_dump d
left outer join territory_map tm on tm.short_postcode = left(post_code,3) and country in ('United States','Canada')
left outer join p_level_ref plr on plr.p_level_id = d.p_lvl_id
left outer join t_level_ref tlr on tlr.t_level_id = d.t_lvl_id
left outer join (select distinct master_item_id, site_item_id from invoice_detail) as map on map.item_id = d.item_no
left outer join item_ref i on i.item_id = map.master_item_id
left outer join area_ref ar on ar.area_id = i.area_id
where (d.cat_id > 80 or d.cat_id < 70)
and d.standing < 4
and d.status not like 'DECEASED'
and d.paid = 1
order by d.associate_id
Most of these columns are straight from the legacy system dump table item_dump. All the joins are only reference tables with few rows. The legacy table itself has about 17000 records but with the where statements the query comes out to 3000.
I have a non-clustered index on the associate_id column.
When I run this query without the order by associate_id clause it takes about 2 seconds. With the order by clause it takes a full minute!
I've tried adding the where clause columns to the index along with associate_id but that didn't change the performance at all.
The end of the execution plan without the order by looks like this:
Using order by, parallelism kicks in on the order by argument and it looks like this:
I thought maybe it was weird SQL Server 2000 parallelism handling, but adding the (maxdop 1) hint made the query take 3 minutes instead!
It isn't really sensible for me to put sorting in the application code because this query caches for about 6 hours before it gets run again and I would have to sort it in the application code many times a minute.
I must be missing something very basic but after straining at the query for an hour i.e. running it 10 times, I can't see what it is anymore.
What happens when u remove all the outer joins and ofcourse the select's in there..
select d.acct_no,
d.associate_id,
d.first_name,
d.last_name,
d.acct_bal,
d.category,
d.status,
d.addr1 + ' ' + isnull(d.addr2,'') address,
d.city,
d.state,
d.country,
d.post_code,
CASE WHEN d.home_phone_ok = 1 THEN d.home_phone END home_phone,
CASE WHEN d.work_phone_ok = 1 THEN d.work_phone END work_phone,
CASE WHEN d.alt_phone_ok = 1 THEN d.alt_phone END alt_phone,
CASE WHEN d.email_ok = 1 THEN d.email END email,
d.last_credit last_paid,
d.service,
d.quantity,
d.amount
from item_dump d
where (d.cat_id > 80 or d.cat_id < 70)
and d.standing < 4
and d.status not like 'DECEASED'
and d.paid = 1
order by d.associate_id
If that works fast then i would go for sub selects inside the select's
select d.acct_no,
d.associate_id,
d.first_name,
d.last_name,
d.acct_bal,
plr.long_name p_lvl,
tlr.long_name t_lvl,
d.category,
d.status,
(select tm.site_name
from territory_map tm
where tm.short_postcode = left(post_code,3)
and country in ('United States','Canada') as site_name
etc. it'll be really faster as left outer joining them in the from clause