SQL2000 to SQL2005. Query now a lot slower

This query used to take 3 seconds in SQL 2000; now it takes about 70 seconds. Both databases give the same results. The 2005 database is not running in compatibility mode.
Currently we're rebuilding the query to run in SQL 2005, by a process of elimination and understanding the logic.
However, can anyone see anything obvious that we've missed?
And/or are there any tools that could help here?
We've been looking at the execution plan, Profiler, and the Index Tuning Wizard.
Profiler shows a massively larger number of records being read to get the same results.
I know that this is a very hard question to debug without the data... another pair of eyes is always good if there is anything obvious!
Cheers
Dave
ALTER PROCEDURE [dbo].[GetNodeList]
    @ViewID int,
    @UserID int = null
as
Select ProcessList.*,
       A.NDOC_DOC_ID,
       A.NDOC_Order,
       A.OMNIBOOK_ID,
       A.Node_Order
from (
    (SELECT N.NOD_ID,
            N.NOD_Name,
            N.NOD_Procname,
            N.NOD_Xpos,
            N.NOD_Ypos,
            N.NOD_Zpos,
            VN.VNOD_VIE_ID
     FROM Node N
     INNER JOIN View_NODe VN
        ON N.NOD_ID = VN.VNOD_NOD_ID
     Where VN.VNOD_VIE_ID = @ViewID) ProcessList
    Left Join
    (
        SELECT N.NOD_ID,
               N.NOD_Name,
               N.NOD_Procname,
               N.NOD_Xpos as NOD_Xpos,
               N.NOD_Ypos as NOD_Ypos,
               N.NOD_Zpos as NOD_Zpos,
               VN.VNOD_VIE_ID,
               ND.NDOC_DOC_ID as NDOC_DOC_ID,
               ND.NDOC_Order as NDOC_Order,
               null as OMNIBOOK_ID,
               null as Node_Order
        FROM Node N
        INNER JOIN View_NODe VN
           ON N.NOD_ID = VN.VNOD_NOD_ID
        LEFT JOIN NODe_DOCument ND
          ON N.NOD_ID = ND.NDOC_NOD_ID
        WHERE VN.VNOD_VIE_ID = @ViewID
          and ND.NDOC_DOC_ID is not null
          and (@UserID is null
               or exists (Select 1
                          from Document D
                          where Doc_ID = ND.NDOC_DOC_ID
                            and dbo.fn_UserCanSeeDoc(@UserID, D.Doc_ID) <> 0
                         )
              )
        UNION
        SELECT N.NOD_ID,
               N.NOD_Name,
               N.NOD_Procname,
               N.NOD_Xpos,
               N.NOD_Ypos,
               N.NOD_Zpos,
               VN.VNOD_VIE_ID,
               null,
               null,
               NOM.OMNIBOOK_ID,
               NOM.Node_Order
        FROM Node N
        INNER JOIN View_NODe VN
           ON N.NOD_ID = VN.VNOD_NOD_ID
        LEFT JOIN NODe_OMNIBOOK NOM
          ON N.NOD_ID = NOM.NODE_ID
        WHERE VN.VNOD_VIE_ID = @ViewID
          and NOM.OMNIBOOK_ID is not null
          and exists (select 1 from Omnibook_Doc where OmnibookID = NOM.OMNIBOOK_ID)
    ) A
    --On ProcessList.NOD_ID = A.NOD_ID
    ON ProcessList.NOD_Xpos = A.NOD_Xpos
   And ProcessList.NOD_Ypos = A.NOD_Ypos
   And ProcessList.NOD_Zpos = A.NOD_Zpos
   And ProcessList.VNOD_VIE_ID = A.VNOD_VIE_ID
)
ORDER BY
    ProcessList.NOD_Xpos,
    ProcessList.NOD_Zpos,
    ProcessList.NOD_Ypos,
    Coalesce(A.NDOC_Order, A.Node_Order),
    Coalesce(A.NDOC_DOC_ID, A.OMNIBOOK_ID)

I've seen this before when the statistics haven't kept up with the data. It's possible in this instance that SQL Server 2005 uses the statistics differently to SQL Server 2000. Try rebuilding your statistics for the tables used in the query; so for each table:
UPDATE STATISTICS <table> WITH FULLSCAN
Yes, I'd add the FULLSCAN unless you know your data well enough to think that a sample of records will give good enough results. It'll slow down the stats creation, but will make it more accurate.
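If there are many tables involved, it may be easier to refresh everything in one pass. A minimal sketch (sp_MSforeachtable is an undocumented but long-standing helper; sp_updatestats is the supported, sampled alternative):

-- Full-scan statistics rebuild for every user table (undocumented helper)
EXEC sp_MSforeachtable 'UPDATE STATISTICS ? WITH FULLSCAN';

-- Supported alternative; uses default sampling rather than FULLSCAN
EXEC sp_updatestats;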

Is it possible that your statistics haven't come across to the 2005 database, so it doesn't have the information needed to build a good plan? As opposed to your old database, which has good statistics on the tables and can choose a better plan for the data?

Could it be an issue with "parameter sniffing", i.e. SQL Server caching a query plan optimized for the parameters supplied for the first execution?
Microsoft TechNet has more detail.
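If sniffing is the culprit, a common workaround (a sketch only, not confirmed as the fix for this procedure) is to copy the parameters into local variables, which the optimizer does not sniff, or to force a fresh plan per execution:

ALTER PROCEDURE [dbo].[GetNodeList]
    @ViewID int,
    @UserID int = null
as
    -- Local copies defeat parameter sniffing: the optimizer falls back to
    -- average-density estimates instead of the first caller's values.
    DECLARE @LocalViewID int, @LocalUserID int;
    SET @LocalViewID = @ViewID;
    SET @LocalUserID = @UserID;
    -- ...same body as before, referencing @LocalViewID/@LocalUserID...
    -- Alternatively, append OPTION (RECOMPILE) to the slow statement.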

A colleague has come up with a solution, which involves bringing the function fn_UserCanSeeDoc back inline into the SQL.
Shown below is the old commented-out function code, with the new inline SQL below it. The code now runs super quick (from over a minute to about a second).
Looking at the old SQL, I'm surprised how good a job SQL 2000 did of running it!
Cheers
--and dbo.fn_UserCanSeeDoc(@UserID,D.Doc_ID)<>0
-- if exists(Select 1 from Omnibook where Omnibook_ID = @DocID)
-- Begin
--     Set @ReturnVal = 1
-- End
--
-- else
-- Begin
--     if exists(
--         Select 1
--         from UserSecurityModule USM
--         Inner join DocSecurity DS
--             On USM.SecurityModuleID = DS.SecurityModuleID
--         where USM.UserID = @UserID
--         and DS.DocID = @DocID
--     )
--
--         Set @ReturnVal = 1
--
--     else
--
--         Set @ReturnVal = 0
-- End
AND D.Doc_ID IN (select DS.DocID
                 from UserSecurityModule USM
                 Inner join DocSecurity DS
                     On USM.SecurityModuleID = DS.SecurityModuleID
                 where USM.UserID = @UserID)

Related

Performance of In Operator with OR conditional SQL

I have a common clause in most of my procedures, like:
Select * from TABLE A + Joins where <Conditions>
And
(
    -- All Broker
    ('True' = (Select AllBrokers from SiteUser where ID = @SiteUserID))
    OR
    (
        A.BrokerID in
        (
            Select BrokerID from SiteUserBroker
            where SiteUserID = @SiteUserID
        )
    )
)
So basically, if the user has access to all brokers, the filter should not be applied at all; otherwise it should restrict to the user's list of brokers.
I am a bit worried about performance, as this is used in a lot of procedures and the data has already passed 100,000 records and will keep growing. Can this be written better?
Any ideas are highly appreciated.
One technique is to build a dynamic T-SQL statement and then execute it. Since this is done in a stored procedure you are OK, and the idea is simple.
DECLARE @DynamicTSQLStatement NVARCHAR(MAX);
SET @DynamicTSQLStatement = N'<base query>';
IF NOT EXISTS (SELECT 1 FROM SiteUser WHERE ID = @SiteUserID AND AllBrokers = 'True')
BEGIN
    SET @DynamicTSQLStatement = @DynamicTSQLStatement
        + N' AND A.BrokerID IN (SELECT BrokerID FROM SiteUserBroker WHERE SiteUserID = @SiteUserID)';
END;
EXEC sp_executesql @DynamicTSQLStatement, N'@SiteUserID INT', @SiteUserID;
Or instead of using a dynamic T-SQL statement, you can have two separate queries - one for users seeing all the data and one for users seeing part of the data. This can lead to code duplication.
Another way is to turn this OR into an INNER JOIN. You should test the performance to be sure you are actually gaining something from it. The idea is to create a temporary table (it can have a primary key or indexes if needed) and store all visible broker IDs there: if the user sees all brokers, then Select BrokerID from SiteUserBroker, and if the user sees only a few, Select BrokerID from SiteUserBroker where SiteUserID = @SiteUserID. This simplifies the whole statement, but be sure to test whether performance actually improves.
CREATE TABLE #SiteUserBroker
(
    [BrokerID] INT PRIMARY KEY
);
INSERT INTO #SiteUserBroker ([BrokerID])
SELECT DISTINCT BrokerID  -- DISTINCT guards the primary key against duplicates
FROM SiteUserBroker
where SiteUserID = @SiteUserID
   OR ('True' = (Select AllBrokers from SiteUser where ID = @SiteUserID));
Select *
from TABLE A
INNER JOIN #SiteUserBroker B
ON A.BrokerID = B.[BrokerID]
-- other joins
where <Conditions>
As we are using an INNER JOIN, you can add it at the beginning. If there are LEFT JOINs after it, this will affect performance in a positive way.
Adding to @gotqn's answer, you can use EXISTS instead of IN (note - this is not a complete answer) -
AND EXISTS (
    Select 1/0 from SiteUserBroker X  -- the select list is never evaluated by EXISTS
    where A.BrokerID = X.BrokerID AND
          X.SiteUserID = @SiteUserID
)
I have found that EXISTS performs better than IN in some cases. Please verify for your own case.

Speed up Update statement T-SQL with Top 1

I'm creating a stored procedure for an ETL script that'll run once per hour to give results of specific operations to users.
I need to find the previous result to the current result. This is fine and I have a working query that I export into Excel. However now I wish to automate the process.
The stored procedure averages at 42 seconds per run. This isn't feasible when running once an hour on the server as I have other automated scripts also running.
My issue is one chunk of the stored procedure averages at 28 seconds, whilst everything else usually takes less than a second (shows up at 00:00:00 in SSMS).
I've managed to reduce the runtime of the other chunks myself, bringing the total down to the 42-second average, but I can't do the same with this one.
I was wondering if any of you know any specific ways to speed this small chunk up?
UPDATE #tmp
SET prev_test_date = (
SELECT TOP 1 r.test_date
FROM [dbo].[results] r (NOLOCK)
WHERE r.number = #tmp.number
AND r.test_date < #tmp.test_date
ORDER BY r.test_date DESC
)
I was originally going to use joins for this to speed it up, although I can't do this due to the TOP 1 part of the query.
Any ideas?
For this query:
UPDATE #tmp
SET prev_test_date = (
SELECT TOP 1 r.test_date
FROM [dbo].[results] r
WHERE r.number = #tmp.number AND
r.test_date < #tmp.test_date
ORDER BY r.test_date DESC
)
You want an index on r(number, test_date).
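For example (the index name here is my own invention):

CREATE INDEX IX_results_number_test_date
    ON dbo.results (number, test_date);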
If you are using SQL Server 2012+ and the test dates are not duplicated, you can also write this as:
with r as (
    select r.*,
           -- ascending order, so LAG returns the immediately preceding test date
           lag(r.test_date) over (partition by r.number order by r.test_date) as prev_test_date
    from [dbo].[results] r
)
update t
set t.prev_test_date = r.prev_test_date
from #tmp t join
     r
     on t.number = r.number and t.test_date = r.test_date;
In fact, if this is the case, you might not need the temporary table. You might be able to modify the code just to use lag().
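A sketch of that idea, reading the previous date straight from the base table (assuming you only need number, test_date, and the derived value):

SELECT r.number,
       r.test_date,
       LAG(r.test_date) OVER (PARTITION BY r.number
                              ORDER BY r.test_date) AS prev_test_date
FROM dbo.results AS r;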
UPDATE #tmp
SET prev_test_date = (
SELECT max(r.test_date)
FROM [dbo].[results] r (NOLOCK)
WHERE r.number = #tmp.number
AND r.test_date < #tmp.test_date
)
Without more info it is hard to tell, but if there is simply too much processing, you may need to make a separate precalculated table and update it incrementally on data changes.
UPDATE #tmp
SET #tmp.prev_test_date = tt.maxdate
from #tmp
join
(
select #tmp.number, max(r.test_date) maxdate
from #tmp
join [dbo].[results] r (NOLOCK)
on r.number = #tmp.number
AND r.test_date < #tmp.test_date
group by #tmp.number
) tt
on tt.number = #tmp.number
and have indexes on both #tmp and [results] on (number, test_date)
I'm having to make some assumptions about the structure and contents of your tables, but if my assumptions are correct, here's the approach I usually use in such situations:
with cteOrderedResults as (
-- Ideally R will be clustered by number, test_date for this
select R.number
,R.test_date
,row_number() over ( partition by R.number
order by R.test_date desc
-- So the most recent R.test_date from
-- before T.test_date gets RowNo=1
) as RowNo
from dbo.results R
inner join #tmp T on R.number=T.number
and R.test_date<T.test_date
)
update T
set T.prev_test_date=R.test_date
from #tmp T
inner join cteOrderedResults R on T.number=R.number
and 1=R.RowNo
This approach works quickly for me on rowsets ranging up to about the million mark. As I've commented, I believe the partitioned row_number() is going to be taking advantage of a corresponding clustered index if it exists; you might find this doesn't work so fast if you don't have the table clustered appropriately.
I agree with comments made elsewhere here, that you should only add the nolock hint back in if you're really sure you need it. If you do, you should use the full correct syntax, with (nolock). From the official MSDN page:
Omitting the WITH keyword is a deprecated feature: This feature will be removed in a future version of Microsoft SQL Server.
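In other words:

-- Deprecated form:  FROM dbo.results r (NOLOCK)
-- Preferred form:   FROM dbo.results AS r WITH (NOLOCK)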

Query runs forever ORACLE

I am trying to update the isdeleted column in one table when a record is not in the other (user) table. My problem is that the query I have written runs forever. How best can I write the query below?
update TBLG2O_REGISTER a set a."isDeleted" = '1'
where a."UserID" not in (select k."UserID" from TBLG2O_USER k)
The answer is going to be database-engine-specific. Performance characteristics differ wildly across different database engines, and you failed to specify which DB server you are using.
However, subqueries are frequently MySQL's Achilles heel; I wouldn't be surprised if this were MySQL. If so, the following approach should have better performance characteristics with MySQL:
update TBLG2O_REGISTER a left join TBLG2O_USER k using(UserID)
set a.isDeleted = '1' where k.UserID is null;
Finally got it to work. Thank you for your help.
Update TBLG2O_REGISTER a set a."isDeleted" = '1'
where a."UserID" in (select p."UserID"
                     from TBLG2O_REGISTER p left join TBLG2O_USER k on p."UserID" = k."UserID"
                     where k."UserID" is null)
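For what it's worth, a NOT EXISTS form (a sketch in the question's Oracle-style syntax) often optimizes better than NOT IN, and unlike NOT IN it is not derailed by NULLs in the subquery column:

update TBLG2O_REGISTER a set a."isDeleted" = '1'
where not exists (select 1
                  from TBLG2O_USER k
                  where k."UserID" = a."UserID")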

Placement of WITH(NOLOCK) in nested queries

In the following query, where would I place WITH (NOLOCK)?
SELECT *
FROM (SELECT *
FROM (SELECT *
FROM (SELECT *
FROM (SELECT *
FROM dbo.VBsplit(@mnemonicList, ',')) a) b
JOIN dct
ON dct.concept = b.concept
WHERE b.geo = dct.geo) c
JOIN dct_rel z
ON c.db_int = z.db_int) d
JOIN rel_d y
ON y.rel_id = d.rel_id
WHERE y.update_status = 0
GROUP BY y.rel_id,
d.concept,
d.geo_rfa
You should not put NOLOCK anywhere in that query. If you are trying to prevent readers from blocking writers, a much better alternative is READ COMMITTED SNAPSHOT. Of course, you should read about this, just like you should read about NOLOCK before blindly throwing it into your queries:
Is the NOLOCK SQL Server hint bad practice?
Is NOLOCK always bad?
What risks are there if we enable read committed snapshot in SQL Server?
Also, since you're using SQL Server 2008, you should probably replace your VBSplit() function with a table-valued parameter - this will be much more efficient than splitting up a string, even if the function is baked in CLR as implied.
First, create a table type that can hold appropriate strings. I'm going to assume the list is guaranteed to be unique and no individual mnemonic word can be > 900 characters.
CREATE TYPE dbo.Strings AS TABLE(Word NVARCHAR(900) PRIMARY KEY);
Now, you can create a procedure that takes a parameter of this type, and which sets the isolation level of your choosing in one location:
CREATE PROCEDURE dbo.Whatever
@Strings dbo.Strings READONLY
AS
BEGIN
SET NOCOUNT ON;
SET TRANSACTION ISOLATION LEVEL --<choose wisely>;
SELECT -- please list your columns here instead of *
FROM @Strings AS s
INNER JOIN dbo.dct -- please always use proper schema prefix
ON dct.concept = s.Word
...
END
GO
Now you can simply pass a collection (such as a DataTable) in from your app, be it C# or whatever, and not have to assemble or deconstruct a messy comma-separated list at all.
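For a quick test from T-SQL itself, a call might look like this (the words are placeholders):

DECLARE @List dbo.Strings;
INSERT INTO @List (Word) VALUES (N'mnemonic1'), (N'mnemonic2');
EXEC dbo.Whatever @Strings = @List;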
Since the question really is "where should I put NOLOCK", I am not going to debate its use or reformat the query with better joins. I will just answer the question.
In no way am I intending to say this is the better way, or that the other answers are bad. The other answers solve the actual problem; I'm just intending to show where exactly to place the lock hints, as the question asks.
SELECT *
FROM (SELECT *
FROM (SELECT *
FROM (SELECT *
FROM (SELECT *
FROM dbo.VBsplit(@mnemonicList, ',')) a) b
JOIN dct WITH (NOLOCK) -- <---
ON dct.concept = b.concept
WHERE b.geo = dct.geo) c
JOIN dct_rel z WITH (NOLOCK) -- <---
ON c.db_int = z.db_int) d
JOIN rel_d y WITH (NOLOCK) -- <---
ON y.rel_id = d.rel_id
WHERE y.update_status = 0
GROUP BY y.rel_id,
d.concept,
d.geo_rfa
Like this, to use the tidiest method.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
SELECT * FROM (SELECT * FROM
(SELECT * FROM (SELECT * FROM
(SELECT * FROM dbo.VBsplit(@mnemonicList,',')) a ) b
JOIN dct ON dct.concept = b.concept WHERE b.geo = dct.geo) c
JOIN dct_rel z ON c.db_int = z.db_int) d
JOIN rel_d y ON y.rel_id = d.rel_id
WHERE y.update_status = 0
GROUP BY y.rel_id,d.concept,d.geo_rfa
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
However, unless you are using this for reporting purposes on an active database, enabling dirty reads may not be the best way to go.
Edited, as (NOLOCK) itself is not deprecated, except as described here: http://technet.microsoft.com/en-us/library/ms143729.aspx.

Optimize query in TSQL 2005

I have to optimize this query. Can someone help me fine-tune it so it will return data faster?
Currently the output takes somewhere around 26 to 35 seconds. I also created an index based on the Attachment table; following is my query and index:
SELECT DISTINCT o.organizationlevel, o.organizationid, o.organizationname, o.organizationcode,
o.organizationcode + ' - ' + o.organizationname AS 'codeplusname'
FROM Organization o
JOIN Correspondence c ON c.organizationid = o.organizationid
JOIN UserProfile up ON up.userprofileid = c.operatorid
WHERE c.status = '4'
--AND c.correspondence > 0
AND o.organizationlevel = 1
AND (up.site = 'ALL' OR
up.site = up.site)
--AND (@Dept = 'ALL' OR @Dept = up.department)
AND EXISTS (SELECT 1 FROM Attachment a
WHERE a.contextid = c.correspondenceid
AND a.context = 'correspondence'
AND ( a.attachmentname like '%.rtf' or a.attachmentname like '%.doc'))
ORDER BY o.organizationcode
I can't just change anything in db due to permission issues, any help would be much appreciated.
I believe your headache is coming from this part specifically... a LIKE inside a WHERE EXISTS can be your performance bottleneck.
AND EXISTS (SELECT 1 FROM Attachment a
WHERE a.contextid = c.correspondenceid
AND a.context = 'correspondence'
AND ( a.attachmentname like '%.rtf' or a.attachmentname like '%.doc'))
This can be written as a join instead.
SELECT DISTINCT o.organizationlevel, o.organizationid, o.organizationname, o.organizationcode,
o.organizationcode + ' - ' + o.organizationname AS 'codeplusname'
FROM Organization o
JOIN Correspondence c ON c.organizationid = o.organizationid
JOIN UserProfile up ON up.userprofileid = c.operatorid
left join Attachment a on a.contextid = c.correspondenceid
AND a.context = 'correspondence'
and right(attachmentname,4) in ('.doc','.rtf')
....
This eliminates both the LIKE and the WHERE EXISTS. Put your WHERE clause at the bottom. It's a LEFT JOIN, so a.anycolumn IS NULL means the record does not exist, and a.anycolumn IS NOT NULL means a record was found. WHERE a.anycolumn IS NOT NULL is the equivalent of a true in the WHERE EXISTS logic.
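Concretely, the tail of the rewritten query would end with something like this (a sketch; any non-nullable column of Attachment works for the null test):

WHERE o.organizationlevel = 1
  AND c.status = '4'
  AND a.contextid IS NOT NULL -- plays the role of the old EXISTS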
Edit to add:
Another thought for you...I'm unsure what you are trying to do here...
AND (up.site = 'ALL' OR
up.site = up.site)
So WHERE up.site = 'ALL' OR 1=1? Is the OR really needed?
And quickly on RIGHT... RIGHT(column, integer) gives you the characters from the right of the string (I used 4, so it'll take the rightmost 4 characters of the column specified). I've found it far faster than a LIKE statement runs.
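A quick illustration:

SELECT RIGHT('summary.doc', 4); -- returns '.doc'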
This is always going to return true so you can eliminate it (and maybe the join to up)
AND (up.site = 'ALL' OR up.site = up.site)
If you can live with dirty reads, then add WITH (NOLOCK).
And I would try Attachment as a join. It might not help, but it's worth a try. LIKE is relatively expensive, and if it is being evaluated in a loop where it could be evaluated once, fixing that would really help.
Join Attachment a
on a.contextid = c.correspondenceid
AND a.context = 'correspondence'
AND ( a.attachmentname like '%.rtf' or a.attachmentname like '%.doc')
I know there are some people on SO who insist that EXISTS is always faster than a join. And yes, it is often faster than a join, but not always.
Another approach is to create a #temp table using
CREATE TABLE #Temp (contextid INT PRIMARY KEY CLUSTERED);
insert into #Temp
Select distinct contextid
from Attachment
where context = 'correspondence'
  AND ( attachmentname like '%.rtf' or attachmentname like '%.doc')
order by contextid;
go
select ...
from correspondence c
join #Temp
on #Temp.contextid = c.correspondenceid
go
drop table #temp
Especially if correspondenceid is the primary key or part of the primary key on correspondence, creating the PK on #Temp will help.
That way you can be sure the LIKE expression is only evaluated once. If the LIKE is the expensive part and is in a loop, it could be tanking the query. I use this a lot where I have a fairly expensive core query and I need those results to pick up reference data from multiple tables. If you do a lot of joins, sometimes the query optimizer goes stupid. But if you give the query optimizer PK-to-PK joins, it does not get stupid and is fast. The downside is that it takes about 0.5 seconds to create and populate the #Temp table.