Having trouble understanding why my query is taking so long, looking for advice to optimise please.
update Laserbeak_Main.dbo.ACCOUNT_MPN set
DateUpgrade = ord.ConnectedDate
FROM [ORDER] ord
WHERE ord.AccountNumber = Laserbeak_Main.dbo.ACCOUNT_MPN.AccountNumber
AND ord.ordertypeID = '2'
AND ord.ConnectedDate IS NOT NULL
AND DateUpgrade <> ord.ConnectedDate
Execution plan as requested on brentozar.com
UPDATE: Following suggestions the new query looks like this & seems to work much more quickly. However if you run the query it sets the rows as expected, then run again it updates the exact same number of rows. Converting to a select confirms that the same rows are being updated each time. The <> clause should stop this but it doesn't. I believed it was something to do with collation but have been unable to confirm if its possible to have different collations at table level in the same database.
;WITH cteOrderInfo AS (
SELECT DISTINCT ord.AccountNumber, ord.ConnectedDate
FROM [ORDER] ord
WHERE ord.ordertypeID = '2'
AND ord.ConnectedDate IS NOT NULL
)
UPDATE Laserbeak_Main.dbo.ACCOUNT_MPN
SET Laserbeak_Main.dbo.ACCOUNT_MPN.DateUpgrade = cteOrderInfo.ConnectedDate
FROM cteOrderInfo
INNER JOIN Laserbeak_Main.dbo.ACCOUNT_MPN acc
ON cteOrderInfo.AccountNumber = acc.AccountNumber
WHERE cteOrderInfo.ConnectedDate <> acc.DateUpgrade
The SELECT to confirm:
;WITH cteOrderInfo AS (
SELECT DISTINCT ord.AccountNumber, ord.ConnectedDate
FROM [ORDER] ord
WHERE ord.ordertypeID = '2'
AND ord.ConnectedDate IS NOT NULL
)
SELECT cteOrderInfo.ConnectedDate, acc.DateUpgrade
FROM cteOrderInfo
INNER JOIN Laserbeak_Main.dbo.ACCOUNT_MPN acc
ON cteOrderInfo.AccountNumber = acc.AccountNumber
WHERE cteOrderInfo.ConnectedDate <> acc.DateUpgrade
SELECT Results Sample:
As Serge suggested, we did not have unique rows.
the solution we arrived at:
;WITH cteSourceStuff AS (
SELECT AccountNumber, MpnUpgrade, MAX(DateConnected) maxConnDate
FROM ORDER_DETAIL, [ORDER]
WHERE ORDER_DETAIL.OrderID = [ORDER].OrderID
AND LEN(MpnUpgrade) > 10
AND OrderTypeID = 2
GROUP BY AccountNumber, MpnUpgrade
)
UPDATE Laserbeak_Main.dbo.ACCOUNT_MPN set
DateUpgrade = cteSourceStuff.maxConnDate
FROM cteSourceStuff
WHERE cteSourceStuff.MpnUpgrade = ACCOUNT_MPN.Mpn
AND cteSourceStuff.AccountNumber = ACCOUNT_MPN.AccountNumber
AND DateUpgrade <> cteSourceStuff.maxConnDate
This works because the duplicates are initially removed, then we only update the rows that we are actually targeting. The reason we have issues before was that SQL was updating the 1st row it found, then when we re-ran or ran the select it was return rows matched on the key but that had not previously been updated.
Related
I have a table that I do some joins and operations on. This table has about 150,000 rows and if I select all and run it, it returns in about 10 seconds. If I create my query into its own table, and filter out all the rows where a certain field is null, now the query takes 10 minutes to run. Is it suppoused to be like this or is there any way to fix it? Here is the query.
SELECT *
FROM
(
Select
I.Date_Created
,I.Company_Code
,I.Division_Code
,I.Invoice_Number
,Sh.CUST_PO
,I.Total_Quantity
,ID.Total
,SH.Ship_City City
,CASE WHEN SH.Ship_Cntry <> 'US' THEN 'INT' ELSE SH.Ship_prov END State
,SH.Ship_Zip Zip
,SH.Ship_Cntry Country
,S.CustomerEmail
from [JMNYC-AMTDB].[AMTPLUS].[dbo].Invoices I (nolock)
LEFT JOIN (SELECT
ID.Company_Code
,ID.Division_Code
,ID.Invoice_Number
,SUM (ID.Price* ID.Quantity) Total
FROM [JMNYC-AMTDB].[AMTPLUS].[dbo].Invoices_Detail ID (nolock)
GROUP BY ID.Company_Code, ID.Division_Code, ID.Invoice_Number) ID
ON I.Company_Code = ID.Company_Code
AND I.Division_Code = ID.Division_Code
AND I.Invoice_Number = ID.Invoice_Number
LEFT JOIN
[JMDNJ-ACCELSQL].[A1WAREHOUSE].[dbo].SHIPHIST SH (nolock) ON I.Pickticket_Number = SH.Packslip
LEFT JOIN
[JMDNJ-ACCELSQL].[A1WAREHOUSE].[dbo].[MagentoCustomerEmailData] S on SH.CUST_PO = S.InvoiceNumber
Where I.Company_Code ='09' AND I.Division_Code = '001'
AND I.Customer_Number = 'ECOM2X'
)T
Where T.CustomerEmail IS NOT NULL -- This is the problematic line
Order By T.Date_Created desc
If you are aware of the Index Considerations and you are sure about the problem point, then you can use this to improve it:
USE A1WAREHOUSE;
GO
CREATE NONCLUSTERED INDEX IX_MagentoCustomerEmailData_CustomerEmail
ON [dbo].[MagentoCustomerEmailData] (CustomerEmail ASC);
GO
Totally, you need to add index on columns used in ORDER BY, WHERE, GROUP BY, ON etc sections. Before adding indexes be sure that you are aware of the consequences.
Read more about Index:
https://www.mssqltips.com/sqlservertutorial/9133/sql-server-nonclustered-indexes/
https://www.itprotoday.com/sql-server/indexing-dos-and-don-ts
Let me open with:
SHOWPLAN permission denied in database 'MyDatabase'.
With that out of the way, I'll layout my situation.
So, The database I work with has a view that executes fairly quickly.
SELECT * FROM MyView
returns 32 rows in 1 second and includes a non-indexed column of values (IDs) I need to filter on.
If I filter on these IDs directly in the view:
SELECT * FROM MyView WHERE MyView.SomeId = 18
Things slow immensely and it takes 21 seconds to return the 20 rows with that ID.
As an experiment I pushed the unfiltered results into a temporary table and executed the filtered query on the the temporary table:
IF OBJECT_ID('tempdb..#TEMP_TABLE') IS NOT NULL
BEGIN
DROP TABLE #TEMP_TABLE
END
SELECT * INTO #TEMP_TABLE
FROM MyView;
SELECT *
FROM #TEMP_TABLE
WHERE #TEMP_TABLE.SomeId = 18
DROP TABLE #TEMP_TABLE
And found that it returns the filtered results far faster (roughly 1 second)
Is there a cleaner syntax or pattern that can be implemented to achieve the same performance?
UPDATE: View Definition and Description
Manually obfuscated, but I was careful so hopefully there aren't many errors. Still waiting on SHOWPLAN permissions, so Execution Plan is still pending.
The view's purpose is to provide a count of all the records that belong to a specific component (CMP.COMPONENT_ID = '100') grouped by location.
"Belonging" is determined by the record's PROC_CODE (mapped through PROC_ID) being within the CMP's inclusion range (CMP_INCs) and not in the CMP's exclusion range (CMP_EXCs).
In practice, exclusion ranges are created for individual codes (the bounds are always equal) making it sufficient to check that the code is not equal a bound.
PROC_CODES can (and don't always) have an alphabetic prefix or suffix, which makes the ISNUMERIC() comparison necessary.
Records store PROC_IDs for their PROC_CODEs, so it's necessary to convert the CMP's PROC_CODE ranges into a set of PROC_IDs for identifying which records belong to that component
The performance issue occurs when trying to filter by DEPARTMENT_ID or LOCATION_ID
[CO_RECORDS] is also a view, but if it's that deep I'm going turf this to someone with less red tape to fight through.
CREATE VIEW [ViewsSchema].[MyView] AS
WITH
CMP_INCs AS (SELECT RNG.*, COALESCE(RNG.RANGE_END, RNG.RANGE_BEG) [SAFE_END] FROM DBEngine.DBO.DB_CMP_RANGE [RNG] WHERE [RNG].COMPONENT_ID = '100'),
CMP_EXCs AS (SELECT CER.* FROM DBEngine.DBO.DB_CMP_EXC_RANGE CER WHERE CER.COMPONENT_ID = '100'),
CMP_PROC_IDs AS (
SELECT
DBEngine_ProcTable.PROC_ID [CMP_PROC_ID],
DBEngine_ProcTable.PROC_CODE [CMP_PROC_CODE],
DB_CmpTable.COMPONENT_ID [CMP_ID],
MAX(DB_CmpTable.COMPONENT_NAME) [CMP_NAME]
FROM [DBEngine].DBO.DBEngine_ProcTable DBEngine_ProcTable
LEFT JOIN CMP_INCs ON ISNUMERIC(DBEngine_ProcTable.PROC_CODE) = ISNUMERIC(CMP_INCs.RANGE_BEG)
AND(DBEngine_ProcTable.PROC_CODE = CMP_INCs.RANGE_BEG
OR DBEngine_ProcTable.PROC_CODE BETWEEN CMP_INCs.RANGE_BEG AND CMP_INCs.SAFE_END)
INNER JOIN DBEngine.DBO.DB_CmpTable ON CMP_INCs.COMPONENT_ID = DB_CmpTable.COMPONENT_ID
LEFT JOIN CMP_EXCs EXCS ON EXCS.COMPONENT_ID = DB_CmpTable.COMPONENT_ID AND EXCS.EXCL_RANGE_END = DBEngine_ProcTable.PROC_CODE
WHERE EXCS.EXCL_RANGE_BEG IS NULL
GROUP BY
DBEngine_ProcTable.PROC_ID,
DBEngine_ProcTable.PROC_CODE,
DBEngine_ProcTable.BILL_DESC,
DBEngine_ProcTable.PROC_NAME,
DB_CmpTable.COMPONENT_ID
)
SELECT
RECORD.LOCATION_NAME [LOCATION_NAME]
, RECORD.LOCATION_ID [LOCATION_ID]
, MAX(RECORD.[Department]) [DEPARTMENT]
, RECORD.[Department ID] [DEPARTMENT_ID]
, SUM(RECORD.PROCEDURE_QUANTITY) [PROCEDURE_COUNT]
FROM DBEngineCUSTOMRPT.ViewsSchema.CO_RECORDS [RECORDS]
INNER JOIN CMP_PROC_IDs [CV] ON [CV].CMP_PROC_ID = [RECORDS].PROC_ID
CROSS JOIN (SELECT DATEADD(M, DATEDIFF(M, 0,GETDATE()), 0) [FIRSTOFTHEMONTH]) VARS
WHERE [RECORDS].TYPE = 1
AND ([RECORDS].VOID_DATE IS NULL OR [RECORDS].VOID_DATE >= VARS.[FIRSTOFTHEMONTH] )
AND [RECORDS].POST_DATE < VARS.[FIRSTOFTHEMONTH]
AND [RECORDS].DOS_MONTHS_BACK = 2
GROUP BY [RECORDS].LOCATION_NAME, [RECORDS].[Department ID]
GO
Based on the swift down votes, the answer to my question is
'No, there is not a clean syntax based solution for the improved
performance, and asking for one is ignorant of the declarative nature
of SQL you simple dirty plebeian'.
From the requests for the view's definition, it's clear that performance issues in simple queries should be addressed by fixing the structure of the objects being queried ('MyView' in this case) rather than syntactical gymnastics.
For interested parties the issue was resolved by adding a Row_Number() column to the final select in the view definition, wrapping it in a CTE, and using the new column in an always true filter while selecting the original columns.
I have no idea if this is the optimal solution. It doesn't feel good to me, but it appears to be working.
CREATE VIEW [ViewsSchema].[MyView] AS
WITH
CMP_INCs AS (SELECT RNG.*, COALESCE(RNG.RANGE_END, RNG.RANGE_BEG) [SAFE_END] FROM DBEngine.DBO.DB_CMP_RANGE [RNG] WHERE [RNG].COMPONENT_ID = '100'),
CMP_EXCs AS (SELECT CER.* FROM DBEngine.DBO.DB_CMP_EXC_RANGE CER WHERE CER.COMPONENT_ID = '100'),
CMP_PROC_IDs AS (
SELECT
DBEngine_ProcTable.PROC_ID [CMP_PROC_ID],
DBEngine_ProcTable.PROC_CODE [CMP_PROC_CODE],
DB_CmpTable.COMPONENT_ID [CMP_ID],
MAX(DB_CmpTable.COMPONENT_NAME) [CMP_NAME]
FROM [DBEngine].DBO.DBEngine_ProcTable DBEngine_ProcTable
LEFT JOIN CMP_INCs ON ISNUMERIC(DBEngine_ProcTable.PROC_CODE) = ISNUMERIC(CMP_INCs.RANGE_BEG)
AND(DBEngine_ProcTable.PROC_CODE = CMP_INCs.RANGE_BEG
OR DBEngine_ProcTable.PROC_CODE BETWEEN CMP_INCs.RANGE_BEG AND CMP_INCs.SAFE_END)
INNER JOIN DBEngine.DBO.DB_CmpTable ON CMP_INCs.COMPONENT_ID = DB_CmpTable.COMPONENT_ID
LEFT JOIN CMP_EXCs EXCS ON EXCS.COMPONENT_ID = DB_CmpTable.COMPONENT_ID AND EXCS.EXCL_RANGE_END = DBEngine_ProcTable.PROC_CODE
WHERE EXCS.EXCL_RANGE_BEG IS NULL
GROUP BY
DBEngine_ProcTable.PROC_ID,
DBEngine_ProcTable.PROC_CODE,
DBEngine_ProcTable.BILL_DESC,
DBEngine_ProcTable.PROC_NAME,
DB_CmpTable.COMPONENT_ID
),
RESULTS as (
SELECT
RECORD.LOCATION_NAME [LOCATION_NAME]
, RECORD.LOCATION_ID [LOCATION_ID]
, MAX(RECORD.[Department]) [DEPARTMENT]
, RECORD.[Department ID] [DEPARTMENT_ID]
, SUM(RECORD.PROCEDURE_QUANTITY) [PROCEDURE_COUNT]
, ROW_NUMBER() OVER (ORDER BY TDL.[Medical Department ID], TDL.[BILL_AREA_ID], TDL.JP_POS_NAME) [ROW]
FROM DBEngineCUSTOMRPT.ViewsSchema.CO_RECORDS [RECORDS]
INNER JOIN CMP_PROC_IDs [CV] ON [CV].CMP_PROC_ID = [RECORDS].PROC_ID
CROSS JOIN (SELECT DATEADD(M, DATEDIFF(M, 0,GETDATE()), 0) [FIRSTOFTHEMONTH]) VARS
WHERE [RECORDS].TYPE = 1
AND ([RECORDS].VOID_DATE IS NULL OR [RECORDS].VOID_DATE >= VARS.[FIRSTOFTHEMONTH] )
AND [RECORDS].POST_DATE < VARS.[FIRSTOFTHEMONTH]
AND [RECORDS].DOS_MONTHS_BACK = 2
GROUP BY [RECORDS].LOCATION_NAME, [RECORDS].[Department ID]
)
SELECT
[LOCATION_NAME]
, [LOCATION_ID]
, [DEPARTMENT]
, [DEPARTMENT_ID]
, [PROCEDURE_COUNT]
FROM RESULTS
WHERE [ROW] > 0
GO
I have the following query:
SELECT
F.IID,
F.E_NUM AS M_E_NUM,
MCI.E_NUM AS MCI_E_NUM,
F.C_NUM AS M_C_NUM,
MCI.C_NUM AS MCI_C_NUM,
F.ET_ID AS M_ET_ID,
EDIE.ET_ID AS ED_INDV_ET_ID,
COUNT(*) OVER (PARTITION BY F.IID) IID_COUNT
FROM FT_T F JOIN CEMEI_T MCI ON F.IID = MCI.IID
JOIN EDE_T EDE ON MCI.E_NUM = EDE.E_NUM
JOIN EDIE_T EDIE ON EDIE.IID = F.IID AND EDIE.ET_ID = EDE.ET_ID
WHERE
F.DEL_F = 'N'
AND MCI.EFF_END_DT IS NULL
AND MCI.TOS = 'BVVB'
AND EDE.PTEND_DT IS NULL
AND EDE.DEL_S = 'N'
AND EDE.CUR_IND = 'A'
AND EDIE.TAR_N = 'Y'
AND F.IID IN
(
SELECT DISTINCT IID
FROM FT_T
WHERE GROUP_ID = 'BG'
AND DEL_F = 'N'
AND (IID, E_NUM) NOT IN
(
SELECT IID, E_NUM FROM CEMEI_T
WHERE TOS = 'BVVB' AND EFF_END_DT IS NULL
)
);
I am basically grabbing information from several tables and creating a flat record of them.
Everything works accordingly except now I need to find out whether there are two records in FT_T table with identical IID's and display that count as part of the result set.
I tried to use partitioning but all the rows in the result set return a single count even though there are ones that have 2 records with identical IID's in FT_T.
The reason I initially said that I'm gathering information from several tables is due to the fact that FT_T might not have all the information I need if two records are not available for the same IID, so I have to retrieve them from other tables JOINed in the query. However, I need to know which FT_T.IID's have two records in FT_T (or greater than one).
Perhaps you need to calculate the count before the join and filtering:
SELECT . . .
FROM (SELECT F.*,
COUNT(*) OVER (PARTITION BY F.IID) as IID_CNT
FROM FT_T F
) JOIN
CEMEI_T MCI
ON F.IID = MCI.IID JOIN
EDE_T EDE
ON MCI.E_NUM = EDE.E_NUM JOIN
EDIE_T EDIE
ON EDIE.IID = F.IID AND EDIE.ET_ID = EDE.ET_ID
. . .
this is merely a comment/observation, but formatting is needed
You use of in(...) with select distinct and not in(...,...) seems complex and could be a problem if some values are NULL. I suggest you consider using EXISTS and NOT EXISTS instead. e.g.
AND EXISTS (
SELECT
NULL
FROM FT_T
WHERE F.IID = FT_T.IID
AND FT_T.GROUP_ID = 'BG'
AND FT_T.DEL_F = 'N'
AND NOT EXISTS (
SELECT
NULL
FROM CEMEI_T
WHERE FT_T.IID = CEMEI_T.IID
AND FT_T.E_NUM = CEMEI_T.E_NUM
AND CEMEI_T.TOS = 'BVVB'
AND CEMEI_T.EFF_END_DT IS NULL
)
)
I am trying to execute updates while my condition returns results, the problem is that when i am testing the query it never finishes.
Here is the query;
While(select COUNT(*) from Agreement as agr where agr.Id in (
select toa.Id from Agreement_TemporaryOnceAgreement as toa where toa.Executed =1)
and agr.EndingDate is null) > 0
begin
DECLARE #AgreementID int;
SET #AgreementID =
(
select top 1 agr.id from Agreement as agr where agr.Id in (
select toa.Id from Agreement_TemporaryOnceAgreement as toa where toa.Executed =1)
and agr.EndingDate is null
)
update Agreement SET EndingDate = (
select tado.Date from TemporaryAgreementsDateOfExecution tado
where tado.AgreementId = CAST(#AgreementID AS INT))
where Agreement.Id = CAST(#AgreementID AS INT);
end;
You don't need a loop. A single update query resembling this should get the job done.
update a
set EndingDate = tado.date
from Agreement a join TemporaryAgreementsDateOfExecution tado
on a.AgreementId = tado.AgreementId
join Agreement_TemporaryOnceAgreement toa
on a.Id = toa.id
where EndingDate is null
and toa.Executed = 1
There might be slight variations depending on the RDBMS you are using.
I have a fairly complex sql that returns 2158 rows' id from a table with ~14M rows. I'm using CTEs for simplification.
The WHERE consists of two conditions. If i comment out one of them, the other runs in ~2 second. If i leave them both (separated by OR) the query runs ~100 seconds. The first condition alone needs 1-2 seconds and returns 19 rows, the second condition alone needs 0 seconds and returns 2139 rows.
What can be the reason?
This is the complete SQL:
WITH fpcRepairs AS
(
SELECT FPC_Row = ROW_NUMBER()OVER(PARTITION BY t.SSN_Number ORDER BY t.Received_Date, t.Claim_Creation_Date, t.Repair_Completion_Date, t.Claim_Submitted_Date)
, idData, Repair_Completion_Date, Received_Date, Work_Order, SSN_number, fiMaxActionCode, idModel,ModelName
, SP=(SELECT TOP 1 Reused_Indicator FROM tabDataDetail td INNER JOIN tabSparePart sp ON td.fiSparePart=sp.idSparePart
WHERE td.fiData=t.idData
AND (td.Material_Quantity <> 0)
AND (sp.SparePartName = '1254-3751'))
FROM tabData AS t INNER JOIN
modModel AS m ON t.fiModel = m.idModel
WHERE (m.ModelName = 'LT26i')
AND EXISTS(
SELECT NULL
FROM tabDataDetail AS td
INNER JOIN tabSparePart AS sp ON td.fiSparePart = sp.idSparePart
WHERE (td.fiData = t.idData)
AND (td.Material_Quantity <> 0)
AND (sp.SparePartName = '1254-3751')
)
), needToChange AS
(
SELECT idData FROM tabData AS t INNER JOIN
modModel AS m ON t.fiModel = m.idModel
WHERE (m.ModelName = 'LT26i')
AND EXISTS(
SELECT NULL
FROM tabDataDetail AS td
INNER JOIN tabSparePart AS sp ON td.fiSparePart = sp.idSparePart
WHERE (td.fiData = t.idData)
AND (td.Material_Quantity <> 0)
AND (sp.SparePartName IN ('1257-2741','1257-2742','1248-2338','1254-7035','1248-2345','1254-7042'))
)
)
SELECT t.idData
FROM tabData AS t INNER JOIN modModel AS m ON t.fiModel = m.idModel
INNER JOIN needToChange ON t.idData = needToChange.idData -- needs to change FpcAssy
LEFT OUTER JOIN fpcRepairs rep ON t.idData = rep.idData
WHERE
rep.idData IS NOT NULL -- FpcAssy replaced, check if reused was claimed correctly
AND rep.FPC_Row > 1 -- other FpcAssy repair before
AND (
SELECT SP FROM fpcRepairs lastRep
WHERE lastRep.SSN_Number = rep.SSN_Number
AND lastRep.FPC_Row = rep.FPC_Row - 1
) = rep.SP -- same SP, must be rejected(reused+reused or new+new)
OR
rep.idData IS NOT NULL -- FpcAssy replaced, check if reused was claimed correctly
AND rep.FPC_Row = 1 -- no other FpcAssy repair before
AND rep.SP = 0 -- not reused, must be rejected
order by t.idData
Here's the execution plan:
Download: http://www.filedropper.com/exeplanfpc
Try to use UNION ALL of 2 queries separately instead of OR condition.
I've tried it many times and it really helped. I've read about this issue in Art Of SQL .
Read it, you can find many useful information about performance issues.
UPDATE:
Check related questions
UNION ALL vs OR condition in sql server query
http://www.sql-server-performance.com/2011/union-or-sql-server-queries/
Can UNION ALL be faster than JOINs or do my JOINs just suck?
Check Wes's answer
The usage of the OR is probably causing the query optimizer to no longer use an index in the second query.