I have a table in SQL Server that I inherited from a legacy system that's still in production, structured according to the code below. I created a stored procedure to query the table, shown below the table create statement. My issue is that, sporadically, calls from .NET to this SP, both through Enterprise Library 4 and through a DataReader object, are slow. The SP is called through a loop structure in the data layer that specifies the parameters passed to the SP in order to populate user objects. It's also important to mention that a slow call will not take place on every pass of the loop structure. It will generally be fine for most of a day or more, and then the problem starts presenting itself, which makes it extremely hard to debug.
The table in question contains about 5 million rows. The slow calls will take as long as 10 seconds, while the fast calls take 0 to 10 milliseconds on average. I checked for locking/blocking transactions during the slow calls and found none. I created some custom performance counters in the data layer to monitor call times. Essentially, when performance is bad, it's really bad for that one call. But when it's good, it's really good. I've been able to recreate the issue on a few different developer machines, but not on our development and staging database servers, which of course have beefier hardware. Generally, the problem is resolved by restarting the SQL Server services, but not always. There are indexes on the table for the fields I'm querying, but there are more indexes than I would like. However, I'm hesitant to remove any or toy with the indexes due to the impact it may have on the legacy system. Has anyone experienced a problem like this before, or do you have a recommendation to remedy it?
CREATE TABLE [dbo].[product_performance_quarterly](
[performance_id] [int] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,
[product_id] [int] NULL,
[month] [int] NULL,
[year] [int] NULL,
[performance] [decimal](18, 6) NULL,
[gross_or_net] [char](15) NULL,
[vehicle_type] [char](30) NULL,
[quarterly_or_monthly] [char](1) NULL,
[stamp] [datetime] NULL CONSTRAINT [DF_product_performance_quarterly_stamp] DEFAULT (getdate()),
[eA_loaded] [nchar](10) NULL,
[vehicle_type_id] [int] NULL,
[yearmonth] [char](6) NULL,
[gross_or_net_id] [tinyint] NULL,
CONSTRAINT [PK_product_performance_quarterly_4_19_04] PRIMARY KEY CLUSTERED
(
[performance_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
ALTER TABLE [dbo].[product_performance_quarterly] WITH NOCHECK ADD CONSTRAINT [FK_product_performance_quarterlyProduct_id] FOREIGN KEY([product_id])
REFERENCES [dbo].[products] ([product_id])
GO
ALTER TABLE [dbo].[product_performance_quarterly] CHECK CONSTRAINT [FK_product_performance_quarterlyProduct_id]
CREATE PROCEDURE [eA.Analytics.Calculations].[USP.GetCalculationData]
(
@PRODUCTID INT, --products.product_id
@BEGINYEAR INT, --year to begin retrieving performance data
@BEGINMONTH INT, --month to begin retrieving performance data
@ENDYEAR INT, --year to end retrieving performance data
@ENDMONTH INT, --month to end retrieving performance data
@QUARTERLYORMONTHLY VARCHAR(1), --do you want quarterly or monthly data?
@VEHICLETYPEID INT, --what product vehicle type are you looking for?
@GROSSORNETID INT --are you looking for gross of fees data or net of fees data?
)
AS
BEGIN
SET NOCOUNT ON
DECLARE @STARTDATE VARCHAR(6),
@ENDDATE VARCHAR(6),
@vBEGINMONTH VARCHAR(2),
@vENDMONTH VARCHAR(2)
IF LEN(@BEGINMONTH) = 1
SET @vBEGINMONTH = '0' + CAST(@BEGINMONTH AS VARCHAR(1))
ELSE
SET @vBEGINMONTH = @BEGINMONTH
IF LEN(@ENDMONTH) = 1
SET @vENDMONTH = '0' + CAST(@ENDMONTH AS VARCHAR(1))
ELSE
SET @vENDMONTH = @ENDMONTH
SET @STARTDATE = CAST(@BEGINYEAR AS VARCHAR(4)) + @vBEGINMONTH
SET @ENDDATE = CAST(@ENDYEAR AS VARCHAR(4)) + @vENDMONTH
--because null values for gross_or_net_id and vehicle_type_id are represented in
--multiple ways (true null, empty string, or 0) in the PPQ table, need to account for all possible variations if
--a -1 is passed in from the .NET code, which represents an enumerated value that
--indicates that the value(s) should be true null.
IF @VEHICLETYPEID = '-1' AND @GROSSORNETID = '-1'
SELECT
PPQ.YEARMONTH, PPQ.PERFORMANCE
FROM PRODUCT_PERFORMANCE_QUARTERLY PPQ
WITH (NOLOCK)
WHERE
(PPQ.PRODUCT_ID = @PRODUCTID)
AND (PPQ.YEARMONTH BETWEEN @STARTDATE AND @ENDDATE)
AND (PPQ.QUARTERLY_OR_MONTHLY = @QUARTERLYORMONTHLY)
AND (PPQ.VEHICLE_TYPE_ID IS NULL OR PPQ.VEHICLE_TYPE_ID = '0' OR PPQ.VEHICLE_TYPE_ID = '')
AND (PPQ.GROSS_OR_NET_ID IS NULL OR PPQ.GROSS_OR_NET_ID = '0' OR PPQ.GROSS_OR_NET_ID = '')
ORDER BY PPQ.YEARMONTH ASC
IF @VEHICLETYPEID <> '-1' AND @GROSSORNETID <> '-1'
SELECT
PPQ.YEARMONTH, PPQ.PERFORMANCE
FROM PRODUCT_PERFORMANCE_QUARTERLY PPQ
WITH (NOLOCK)
WHERE
(PPQ.PRODUCT_ID = @PRODUCTID)
AND (PPQ.YEARMONTH BETWEEN @STARTDATE AND @ENDDATE)
AND (PPQ.QUARTERLY_OR_MONTHLY = @QUARTERLYORMONTHLY)
AND (PPQ.VEHICLE_TYPE_ID = @VEHICLETYPEID)
AND (PPQ.GROSS_OR_NET_ID = @GROSSORNETID)
ORDER BY PPQ.YEARMONTH ASC
IF @VEHICLETYPEID = '-1' AND @GROSSORNETID <> '-1'
SELECT
PPQ.YEARMONTH, PPQ.PERFORMANCE
FROM PRODUCT_PERFORMANCE_QUARTERLY PPQ
WITH (NOLOCK)
WHERE
(PPQ.PRODUCT_ID = @PRODUCTID)
AND (PPQ.YEARMONTH BETWEEN @STARTDATE AND @ENDDATE)
AND (PPQ.QUARTERLY_OR_MONTHLY = @QUARTERLYORMONTHLY)
AND (PPQ.VEHICLE_TYPE_ID IS NULL OR PPQ.VEHICLE_TYPE_ID = '0' OR PPQ.VEHICLE_TYPE_ID = '')
AND (PPQ.GROSS_OR_NET_ID = @GROSSORNETID)
ORDER BY PPQ.YEARMONTH ASC
IF @VEHICLETYPEID <> '-1' AND @GROSSORNETID = '-1'
SELECT
PPQ.YEARMONTH, PPQ.PERFORMANCE
FROM PRODUCT_PERFORMANCE_QUARTERLY PPQ
WITH (NOLOCK)
WHERE
(PPQ.PRODUCT_ID = @PRODUCTID)
AND (PPQ.YEARMONTH BETWEEN @STARTDATE AND @ENDDATE)
AND (PPQ.QUARTERLY_OR_MONTHLY = @QUARTERLYORMONTHLY)
AND (PPQ.VEHICLE_TYPE_ID = @VEHICLETYPEID)
AND (PPQ.GROSS_OR_NET_ID IS NULL OR PPQ.GROSS_OR_NET_ID = '0' OR PPQ.GROSS_OR_NET_ID = '')
ORDER BY PPQ.YEARMONTH ASC
END
I have seen this happen with indexes whose statistics were out of date. It could also be a parameter sniffing problem, where a different query plan is being used for different parameters that come into the stored procedure.
You should capture the parameters of the slow calls and see if they are the same ones each time it runs slow.
You might also try running the tuning wizard and see if it recommends any indexes.
You don't want to worry about having too many indexes until you can prove that updates and inserts are happening too slow (time needed to modify the index plus locking/contention), or you are running out of disk space for them.
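If you want a quick way to rule those two causes out, a minimal sketch is below; refreshing the statistics and forcing a recompile is just one approach to test with, not necessarily the fix (adjust the object names to your schema):
UPDATE STATISTICS dbo.product_performance_quarterly WITH FULLSCAN
EXEC sp_recompile N'dbo.product_performance_quarterly'
-- Or, inside the procedure, append OPTION (RECOMPILE) to each SELECT so a plan
-- is built for the actual parameter values on every call, e.g.:
-- ... ORDER BY PPQ.YEARMONTH ASC OPTION (RECOMPILE)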
Sounds like another query is running in the background that has locked the table, and your innocent query is simply waiting for it to finish.
A strange, edge case but I encountered it recently.
If the queries run longer from the application than they do when run from within Management Studio, check the ARITHABORT setting - try running the procedure in Management Studio with ARITHABORT OFF to match the application. The connection options used by Management Studio are different from the ones used by .NET, so the same procedure can end up with a different cached plan for each.
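One way to test that theory, as a rough sketch, is to reproduce the application's connection options in a Management Studio window before calling the procedure (ADO.NET connections typically run with ARITHABORT OFF while Management Studio defaults to ON; the parameter values below are placeholders, not real data):
SET ARITHABORT OFF
EXEC [eA.Analytics.Calculations].[USP.GetCalculationData]
    @PRODUCTID = 1, @BEGINYEAR = 2008, @BEGINMONTH = 1,
    @ENDYEAR = 2008, @ENDMONTH = 12, @QUARTERLYORMONTHLY = 'M',
    @VEHICLETYPEID = -1, @GROSSORNETID = -1
If the slow behaviour only appears with the application's settings, you are almost certainly looking at two different cached plans.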
It seems like it's one of two things - either the parameters on the slow calls are different in some way than on the fast calls, and they're not able to use the indexes as well, or there's some type of locking contention that's holding you up. You say you've checked for blocking locks while a particular process is hung and saw none - that would suggest it's the first one. However - are you sure that the development and staging servers (where you can't reproduce this) and the developer machines (where you can) have the same database configuration? For example, maybe "READ COMMITTED SNAPSHOT" is enabled on the servers but not on the developer machines, which would cause read contention issues to disappear on the servers.
If it's a difference in parameters, I'd suggest using SQL Profiler to watch the transactions and capture a few - some slow ones and some fast ones - and then, in a Management Studio window, replace the variables in the SP above with the captured parameter values and get an execution plan by pressing Ctrl+L. This will tell you exactly how SQL Server expects to process your query, and you can compare the execution plans for different parameter combinations to see if there's a difference with one set, and work from there to optimize it.
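If you would rather not leave Profiler running, a rough alternative (assuming SQL Server 2005 or later) is to pull the cached plans and average durations for the statements that touch the table straight from the DMVs:
SELECT qs.execution_count,
    qs.total_elapsed_time / qs.execution_count AS avg_elapsed_microseconds,
    st.text,
    qp.query_plan
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
WHERE st.text LIKE '%PRODUCT_PERFORMANCE_QUARTERLY%'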
Good luck!
Related
I have a complex unit of work from an application that might commit changes to 10-15 tables as a single transaction. The unit of work executes under snapshot isolation.
Some of the tables have a trigger which executes a stored procedure to log messages into a queue. The message contains the Table Name, Key and Change Type. This is necessary to provide backwards compatibility with SQL2005, I can't use the built in queuing.
The problem is I am getting blocking and time-outs in the queue writing stored procedure. I either get a message saying:
Snapshot isolation transaction aborted due to update conflict. You cannot use snapshot isolation to access table 'dbo.tblObjectChanges' directly or indirectly in database
or I get a timeout writing to that table.
Is there a way to change the transaction isolation of the particular call to (or within) the stored procedure that does the message queue writing, from within the trigger? As a last resort, can I make the call to the delete or update parts of the stored procedure run asynchronously?
Here is the SQL for the Stored Procedure:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[usp_NotifyObjectChanges]
@ObjectType varchar(20),
@ObjectKey int,
@Level int,
@InstanceGUID varchar(50),
@ChangeType int = 2
AS
SET NOCOUNT ON
DECLARE @ObjectChangeID int
--Clean up any messages older than 10 minutes
DELETE from tblObjectChanges Where CreatedTime < DATEADD(MINUTE, -10, GetDate())
--If the object is already in the queue, change the time and instanceID
SELECT @ObjectChangeID = [ObjectChangeID] FROM tblObjectChanges WHERE [ObjectType] = @ObjectType AND [ObjectKey] = @ObjectKey AND [Level] = @Level
IF @ObjectChangeID IS NOT NULL
BEGIN
UPDATE [dbo].[tblObjectChanges] SET
[CreatedTime] = GETDATE(), InstanceGUID = @InstanceGUID
WHERE
[ObjectChangeID] = @ObjectChangeID
END
ELSE
BEGIN
INSERT INTO [dbo].[tblObjectChanges] (
[CreatedTime],
[ObjectType],
[ObjectKey],
[Level],
ChangeType,
InstanceGUID
) VALUES (
GETDATE(),
@ObjectType,
@ObjectKey,
@Level,
@ChangeType,
@InstanceGUID
)
END
Definition of tblObjectChanges:
CREATE TABLE [dbo].[tblObjectChanges](
[CreatedTime] [datetime] NOT NULL,
[ObjectType] [varchar](20) NOT NULL,
[ObjectKey] [int] NOT NULL,
[Rowversion] [timestamp] NOT NULL,
[Level] [int] NOT NULL,
[ObjectChangeID] [int] IDENTITY(1,1) NOT NULL,
[InstanceGUID] [varchar](50) NULL,
[ChangeType] [int] NOT NULL,
CONSTRAINT [PK_tblObjectChanges] PRIMARY KEY CLUSTERED
(
[ObjectChangeID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY]
) ON [PRIMARY]
GO
This line is almost certainly your problem:
DELETE from tblObjectChanges Where CreatedTime < DATEADD(MINUTE, -10, GetDate())
There are two BIG problems with this statement. First, according to your table definition, CreatedTime is not indexed. This means that in order to execute this statement, the entire table must be scanned, and that will cause the entire table to be locked for the duration of whatever transaction this happens to be a part of. So put an index on this column.
The second problem, is that even with an index, you really shouldn't be performing operational maintenance tasks like this from within a trigger. Besides slowing down the OLTP transactions that have to execute it, this statement only really needs to be executed once every 5-10 minutes. Instead, you are executing it any time (and every time) any of these tables are modified. That is a lot of additional load that gets worse as your system gets busier.
A better approach would be to take this statement out of the triggers entirely, and instead have a SQL Agent Job that runs every 5-10 minutes to execute this clean-up operation. If you do this along with adding the index, most of your problems should disappear.
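A minimal sketch of what that job step could run (the batch size is an arbitrary choice; deleting in small batches keeps the locks short):
WHILE 1 = 1
BEGIN
    DELETE TOP (1000) FROM dbo.tblObjectChanges
    WHERE CreatedTime < DATEADD(MINUTE, -10, GETDATE())
    IF @@ROWCOUNT = 0 BREAK
END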
An additional problem is this statement:
SELECT @ObjectChangeID = [ObjectChangeID] FROM tblObjectChanges WHERE [ObjectType] = @ObjectType AND [ObjectKey] = @ObjectKey AND [Level] = @Level
Unlike the first statement above, this statement belongs in the trigger. However, like the first statement, it too will have (and cause) serious performance and locking issues under load, because again, according to your posted table definition, none of the columns being searched are indexed.
The solution again is to put an additional index on these columns as well.
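Illustrative definitions for both indexes (the names are made up; verify the key order against your actual query patterns):
CREATE NONCLUSTERED INDEX IX_tblObjectChanges_CreatedTime
    ON dbo.tblObjectChanges (CreatedTime)
CREATE NONCLUSTERED INDEX IX_tblObjectChanges_ObjectType_ObjectKey_Level
    ON dbo.tblObjectChanges (ObjectType, ObjectKey, [Level])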
A few ideas:
Move the delete into a separate scheduled job if possible
Add an index on CreatedTime
Add an index on ObjectType, ObjectKey, Level
add WITH(UPDLOCK, ROWLOCK) to the SELECT
add WITH(ROWLOCK) to the INSERT and the UPDATE
You need to test all of these to see what helps. I would go through them in this order, but see the note below.
Even if you decide against all this, at least leave the WITH(UPDLOCK) on the SELECT, as you otherwise might lose updates.
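For reference, this is roughly what the hinted SELECT and UPDATE from the posted procedure would look like (same logic, hints added; the INSERT would take WITH (ROWLOCK) the same way):
SELECT @ObjectChangeID = [ObjectChangeID]
FROM tblObjectChanges WITH (UPDLOCK, ROWLOCK)
WHERE [ObjectType] = @ObjectType AND [ObjectKey] = @ObjectKey AND [Level] = @Level
UPDATE [dbo].[tblObjectChanges] WITH (ROWLOCK) SET
    [CreatedTime] = GETDATE(), InstanceGUID = @InstanceGUID
WHERE [ObjectChangeID] = @ObjectChangeID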
My manager assigned me a project whereby I have been using the jQuery Calendar plugin to display calendar data stored across various tables in our SQL database.
It's just a jQuery plugin that takes static JSON data and renders it on a calendar. I had to integrate it with .NET and our SQL database in such a way that the calendar could render the data from the SQL database (Microsoft SQL Server).
Initially we put this together in such a way that we fetched all the data from the SQL server and then used .NET to construct the JSON and then pass it on to the jQuery calendar plugin.
Although in principle this worked well, it was extremely slow and IIS was often timing out. Not to mention, every time any of us wanted to view the calendar we had to wait around 3 minutes, since the number of entries is approaching 3,000.
The queries were quite complex; they use on-the-fly DATEADD and DATEDIFF functions and all sorts of other operations. Execution time on the SQL server alone was around 90 seconds for the query. In total, the query size was around 160 KB.
We then split the query into 3 parts (for different departments), but the amount of time we have to wait for the calendar to draw is still over a minute.
Here is an example of just one of the queries, but there are over 100 of these per department:
CREATE TABLE #AnnualLastMonImportantCustomDate(
Title varchar(550) COLLATE Latin1_General_CI_AS NULL,
AllocatedDate varchar(550) COLLATE Latin1_General_CI_AS NULL,
EndDateTime varchar(550) COLLATE Latin1_General_CI_AS NULL,
url varchar(550) COLLATE Latin1_General_CI_AS NULL,
width varchar(10) COLLATE Latin1_General_CI_AS NULL,
height varchar(550) COLLATE Latin1_General_CI_AS NULL,
AllDay varchar(550) COLLATE Latin1_General_CI_AS NULL,
description varchar(550) COLLATE Latin1_General_CI_AS NULL,
color varchar(550) COLLATE Latin1_General_CI_AS NULL,
textColor varchar(550) COLLATE Latin1_General_CI_AS NULL
)
DECLARE db_cursor CURSOR FOR SELECT AlertDate FROM xsCRMAlerts
WHERE AlertType='InternalImportantDate'
-- cursor is the results row when table goes through fetch process
SET @MyTableName='xsCRMAlerts'
OPEN db_cursor -- opens the table and stores id, which is the primary key in the table
FETCH NEXT FROM db_cursor INTO @MyTableName -- @MyTableName in this case is the result row.
WHILE @@FETCH_STATUS = 0 -- 0 is for success -1 is for too many results -2 is for the row fetched is missing
BEGIN
-- Below between begin and end the statement is linked to a function, which gives the dates tabled based on a start date. This table is then cross joined to produce desired result.
SET @startDate = @MyTableName -- we can set the start date to all the data we received because we have only asked for one field in our @MyTableName query when db_cursor was being drawn
INSERT INTO #AnnualLastMonImportantCustomDate
SELECT
'Important Date : ' + [Title] as 'Title',
dr.date as 'AllocatedDate',
dr.date as 'EndDateTime' ,
'xsCRM_Dates_Edit.aspx?id=' + cast(id as varchar) as 'url' ,
'515px' as 'width',
'410px' as 'height',
'true' as 'allDay',
'Important date' as 'description', /* This is a static entry and will not show on the calendar. Used when redering object*/
'yellow' as 'color',
'black' as 'textColor'
FROM [DelphiDude].[dbo].[xsCRMAlerts]
cross JOIN
dateTable(
DATEADD(yy,DATEDIFF(yy,0,GETDATE()),0)
,
DateAdd(yy,1,DATEADD(ms,-3,DATEADD(yy,0,DATEADD(yy,DATEDIFF(yy,0,GETDATE())+1,0))))
) dr -- You can specify intervals by calling DateTable_Month, DateTable_Quarter, DateTable_BiAnnual and DateTable_Annual
WHERE
(AlertType='InternalImportantDate') and
(occurring='765') and
(Datepart(m,date) = 12) and
(Datepart(day,date) > 24) and
(Datepart(dw,date) = 2) and
(Datepart(year,date) = (Datepart(year,getDate()) + 1))
FETCH NEXT FROM db_cursor INTO @MyTableName -- gets the next record from the table
END
CLOSE db_cursor
DEALLOCATE db_cursor
We really do need these queries.
We've now thought about limiting the result set to just the previous and next 30 days.
But each time we optimise a query, I then (even if it's just using find and replace) have to replicate that change across 100 queries per module.
Is there a way we can optimise these queries and speed up the execution and calendar rendering time that's definitive and improves it by a long shot? And is there a way that I can apply the changes in such a way that they replicate across each of the queries?
I suggested using caching, db caching and object caching to my boss, but he said the data would be changing often and data from here needs to be passed on to other modules, and therefore if it is cached it could be inaccurate. I don't have enough experience to contest what he was saying.
Any advice anyone?
In the query that you post, the cursor is useless because you never use the @startDate or @MyTableName variable in the insert query.
So a lot of duplicate rows are potentially inserted into your temp table.
Also, try to use either a CTE or a table variable instead of the #temporary table, because the data of #temporary tables is stored physically on the filesystem and costs a lot of I/O, increasing the execution time.
One last piece of advice: don't forget to create clustered/non-clustered indexes on your xsCRMAlerts table. If you are using SQL Server Management Studio, the execution plan or the Database Engine Tuning Advisor tool can help you a lot in finding missing indexes.
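For example, something along these lines could support the AlertType/occurring filter used in the query above (the index name is made up; check the INCLUDE list against the real table definition):
CREATE NONCLUSTERED INDEX IX_xsCRMAlerts_AlertType_occurring
    ON dbo.xsCRMAlerts (AlertType, occurring)
    INCLUDE (Title, AlertDate, id)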
Hope this helps :)
I'm selecting the available login infos from a DB randomly via the stored procedure below. But when multiple threads want to get the available login infos, duplicate records are returned although I'm updating the timestamp field of the record.
How can I lock the rows here so that the record returned once won't be returned again?
Putting
WITH (HOLDLOCK, ROWLOCK)
didn't help!
SELECT TOP 1 @uid = [LoginInfoUid]
FROM [ZPer].[dbo].[LoginInfos]
WITH (HOLDLOCK, ROWLOCK)
WHERE ([Type] = @type)
...
...
...
ALTER PROCEDURE [dbo].[SelectRandomLoginInfo]
-- Add the parameters for the stored procedure here
@type int = 0,
@expireTimeout int = 86400 -- 24 * 60 * 60 = 24h
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- Insert statements for procedure here
DECLARE @processTimeout int = 10 * 60
DECLARE @uid uniqueidentifier
BEGIN TRANSACTION
-- SELECT [LoginInfos] which are currently not being processed ([Timestamp] is timedout) and which are not expired.
SELECT TOP 1 @uid = [LoginInfoUid]
FROM [MyDb].[dbo].[LoginInfos]
WITH (HOLDLOCK, ROWLOCK)
WHERE ([Type] = @type) AND ([Uid] IS NOT NULL) AND ([Key] IS NOT NULL) AND
(
([Timestamp] IS NULL OR DATEDIFF(second, [Timestamp], GETDATE()) > @processTimeout) OR
(
DATEDIFF(second, [UpdateDate], GETDATE()) <= @expireTimeout OR
([UpdateDate] IS NULL AND DATEDIFF(second, [CreateDate], GETDATE()) <= @expireTimeout)
)
)
ORDER BY NEWID()
-- UPDATE the selected record so that it won't be re-selected.
UPDATE [MyDb].[dbo].[LoginInfos] SET
[UpdateDate] = GETDATE(), [Timestamp] = GETDATE()
WHERE [LoginInfoUid] = @uid
-- Return the full record data.
SELECT *
FROM [MyDb].[dbo].[LoginInfos]
WHERE [LoginInfoUid] = @uid
COMMIT TRANSACTION
END
Locking a row in shared mode doesn't help a bit in preventing multiple threads from reading the same row. You want to lock the row exclusively with the XLOCK hint. Also, you are using a very low precision marker for determining candidate rows (GETDATE has roughly 3ms precision), so you will get a lot of false positives. You should use a precise field, like a bit (processing 0 or 1).
Ultimately you are treating LoginInfos as a queue, so I suggest you read Using tables as Queues. The way to achieve what you want is to use UPDATE ... WITH OUTPUT. But you have an additional requirement to select a random login, which throws everything haywire. Are you really, really, 100% convinced that you need randomness? It is an extremely unusual requirement, and you will have a heck of a hard time coming up with a solution that is both correct and performant. You'll get duplicates and you're going to deadlock constantly.
A first attempt would go something like:
with cte as (
select top 1 ...
from [LoginInfos] with (readpast)
where processing = 0 and ...
order by newid())
update cte
set processing = 1
output cte...
But because the NEWID order requires a full table scan and a sort to pick the one lucky winner row, this will be 1) extremely slow and 2) deadlock constantly.
Now you may take this a a random forum rant, but it so happens I've been working with SQL Server backed queues for some years now and I know what you want will not work. You must modify your requirement, specifically the randomness, and then you can go back to the article linked above and use one of the true and tested schemes.
Edit
If you don't need randomness then it is somewhat simpler. The gist of the tables-as-queues issue is that you must seek your output row; you absolutely cannot scan for it. Scanning over a queue is not only slow, it is a guaranteed deadlock because of the way queues are used (highly concurrent dequeue operations where all threads want the same row). To achieve this your WHERE clause must be SARG-able, which depends on 1) the expressions in the WHERE clause and 2) the clustered index key. Your expression cannot contain OR conditions, so lose all the IS NULL OR ..., modify the fields to be non-nullable and always populate them. Second, you must compare in an index-friendly manner: not DATEDIFF(..., field, ...) < @variable but instead field < DATEADD(..., @variable, ...), because the second form is SARG-able. And you must settle for one of the two fields, [Timestamp] or [UpdateDate]; you cannot seek on both. All of this, of course, calls for a much more strict and tight state machine in your application, but that is a good thing; the lax conditions and OR clauses are only an indication of poor data input.
declare @now datetime, @expired datetime;
select @now = getdate();
select @expired = dateadd(second, -@processTimeout, @now);
with cte as (
select *
from [MyDb].[dbo].[LoginInfos] with (readpast, xlock)
where [Type] = @type and [Timestamp] < @expired)
update cte
set [Timestamp] = @now
output inserted.*;
For this to work, the clustered index of the table must be on ([Type], [Timestamp]) (which implies making the primary key LoginInfoUid a non-clustered index).
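A rough sketch of that change; the constraint and index names are assumptions, and rebuilding the clustered index rewrites the whole table, so test it on a copy first:
ALTER TABLE dbo.LoginInfos DROP CONSTRAINT PK_LoginInfos -- hypothetical existing PK name
ALTER TABLE dbo.LoginInfos
    ADD CONSTRAINT PK_LoginInfos PRIMARY KEY NONCLUSTERED (LoginInfoUid)
CREATE CLUSTERED INDEX IX_LoginInfos_Type_Timestamp
    ON dbo.LoginInfos ([Type], [Timestamp])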
Morning All,
I have a website I am working on that is around 2,000 pages of code; it is a social media site for businesses. It has the potential for millions of users. Currently we have around 80,000 users and the site access is getting sluggish. I am using 98% stored procedures in the site to improve speed. What I want to know is what I can do to improve data extraction speed and increase the site loading times. To my knowledge, the Member table of the database is not using full-text indexing; would that make a difference? I guess it would for searching. But, for example, when logging in it takes a while to load. Here is the login SP script:
SELECT
a.MemberID,
CAST (ISNULL(a.ProfileTypeID,0) AS bit) AS HasProfile,
a.TimeOffsetDiff * a.TimeOffsetUnits AS TimeOffset,
b.City,
b.StateName AS State,
b.StateAbbr AS abbr,
b.Domain,
b.RegionID,
a.ProfileTypeID,
sbuser.sf_DisplayName(a.MemberID) AS DisplayName,
a.UserName,
a.ImgLib,
a.MemberREgionID AS HomeRegionID,
a.StateID,
a.IsSales,
a.IsAdmin
FROM Member a
INNER JOIN Region b ON b.RegionID = a.MemberRegionID
WHERE a.MemberID = @MemberID
UPDATE Member SET NumberLogins = (NumberLogins + 1) WHERE MemberID = @MemberID
Considering this is hunting through only 80,000 members and can take up to 15 seconds to log in, I consider that to be really slow. Any thoughts on how I can increase login speed?
Obviously, extracting member lists into pages can be laborious too. I recently updated outdated scripting that contained temporary datasets and the like for paging and replaced it with the following example:
IF @MODE = 'MEMBERSEARCHNEW'
DECLARE @TotalPages INT
BEGIN
SELECT @TotalPages = COUNT(*)/@PageSize
FROM Member a
LEFT JOIN State b ON b.StateID = a.StateID
WHERE (sbuser.sf_DisplayName(a.MemberID) LIKE @UserName + '%')
AND a.MemberID <> @MemberID;
WITH FindSBMembers AS
(
SELECT ROW_NUMBER() OVER(ORDER BY a.Claimed DESC, sbuser.sf_MemberHasAvatar(a.MemberID) DESC) AS RowNum,
a.MemberID, -- 1
a.UserName, -- 2
a.PrCity, -- 3
b.Abbr, -- 4
sbuser.sf_MemberHasImages(a.MemberID) AS MemberHasImages, -- 5
sbuser.sf_MemberHasVideo(a.MemberID) AS MemberHasVideo, -- 6
sbuser.sf_MemberHasAudio(a.MemberID) AS MemberHasAudio, -- 7
sbuser.sf_DisplayName(a.MemberID) AS DisplayName, -- 8
a.ProfileTypeID, -- 9
a.Zip, -- 10
a.PhoneNbr, -- 11
a.PrPhone, -- 12
a.Claimed, -- 13
@TotalPages AS TotalPages -- 14
FROM Member a
LEFT JOIN State b ON b.StateID = a.StateID
WHERE (sbuser.sf_DisplayName(a.MemberID) LIKE @UserName + '%')
AND a.MemberID <> @MemberID
)
SELECT *
FROM FindSBMembers
WHERE RowNum BETWEEN (@PG - 1) * @PageSize + 1
AND @PG * @PageSize
ORDER BY Claimed DESC, sbuser.sf_MemberHasAvatar(MemberID) DESC
END
Is there any further way I can squeeze any more speed out of this script?
I have had other suggestions, including gzip compression and breaking the Member table into 26 tables based on letters of the alphabet. I am interested to know how the big companies do it, how they arrange their data, sites like Facebook, Yelp, Yellow Pages, Twitter. I am currently running on a shared hosting server; would an upgrade to a VPS or dedicated server help improve speed?
The site is written in Classic ASP, utilizing SQL Server 2005.
Any help that any of you can provide will be greatly appreciated.
Best Regards and Happy Coding!
Paul
**** ADDITION START:
set ANSI_NULLS ON
set QUOTED_IDENTIFIER ON
GO
ALTER FUNCTION [sbuser].[sf_DisplayName](@MemberID bigint)
RETURNS varchar(150)
AS
BEGIN
DECLARE @OUT varchar(150)
DECLARE @UserName varchar(50)
DECLARE @FirstName varchar(50)
DECLARE @LastName varchar(50)
DECLARE @BusinessName varchar(50)
DECLARE @DisplayNameTypeID int
SELECT
@FirstName = upper(left(FirstName, 1)) + right(FirstName, len(FirstName) - 1),
@LastName = upper(left(LastName, 1)) + right(LastName, len(LastName) - 1),
@BusinessName = upper(left(BusinessName, 1)) + right(BusinessName, len(BusinessName) - 1),
@UserName = upper(left(UserName, 1)) + right(UserName, len(UserName) - 1),
/*
@FirstName = FirstName,
@LastName = LastName,
@BusinessName = BusinessName,
@UserName = UserName,
*/
@DisplayNameTypeID = DisplayNameTypeID
FROM Member
WHERE MemberID = @MemberID
IF @DisplayNameTypeID = 2 -- FIRST / LAST NAME
BEGIN
/*SET @OUT = @FirstName + ' ' + @LastName*/
SET @OUT = @LastName + ', ' + @FirstName
END
IF @DisplayNameTypeID = 3 -- FIRST NAME / LAST INITIAL
BEGIN
SET @OUT = @FirstName + ' ' + LEFT(@LastName,1) + '.'
END
IF @DisplayNameTypeID = 4 -- BUSINESS NAME
BEGIN
SET @OUT = @BusinessName + ''
END
RETURN @OUT
END
**** ADDITION END
80,000 isn't a whole lot of records, unless you either have no indexes or your data types are huge. If that query really is your bottleneck, then you might want to consider creating covering indexes on the Member table and the Region table.
Create an index on the Member table with MemberID as the key, and include ProfileTypeID, TimeOffsetDiff, TimeOffsetUnits, UserName, ImgLib, MemberRegionID, StateID, IsSales, and IsAdmin.
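A sketch of that covering index, plus a matching one on Region since the login query joins to it (names are illustrative; whether they help depends on what the existing clustered indexes already cover):
CREATE NONCLUSTERED INDEX IX_Member_MemberID_Covering
    ON dbo.Member (MemberID)
    INCLUDE (ProfileTypeID, TimeOffsetDiff, TimeOffsetUnits, UserName,
        ImgLib, MemberRegionID, StateID, IsSales, IsAdmin)
CREATE NONCLUSTERED INDEX IX_Region_RegionID_Covering
    ON dbo.Region (RegionID)
    INCLUDE (City, StateName, StateAbbr, Domain)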
Also, I just noticed your function sbuser.sf_DisplayName(a.MemberID). You might look into that function to make sure it isn't your true bottleneck.
First option to speed up sf_DisplayName is to add FirstName, LastName etc. from Member as parameters and use those to build the DisplayName instead of doing a lookup against the Member table.
After that you could consider adding DisplayName as a computed, persisted column to the Member table. That means that the DisplayName will be calculated when the member is saved, and the saved value will be used when you do the query. You can also add an index on the DisplayName column.
The GetDisplayName function must be created with schemabinding:
create function dbo.GetDisplayName(
@FirstName varchar(50),
@LastName varchar(50),
@DisplayNameType int)
returns varchar(102) with schemabinding
as
begin
declare @Res varchar(102)
set @Res = ''
if @DisplayNameType = 1
set @Res = @FirstName+' '+@LastName
if @DisplayNameType = 2
set @Res = @LastName+', '+@FirstName
return @Res
end
The table with the persisted column DisplayName
CREATE TABLE [dbo].[Member](
[ID] [int] NOT NULL,
[FirstName] [varchar](50) NOT NULL,
[LastName] [varchar](50) NOT NULL,
[DisplayNameType] [int] NOT NULL,
[DisplayName] AS ([dbo].[GetDisplayName]([FirstName],[LastName],[DisplayNameType])) PERSISTED,
CONSTRAINT [PK_Member] PRIMARY KEY CLUSTERED
(
[ID] ASC
)
)
The index on DisplayName
CREATE INDEX [IX_Member_DisplayName] ON [dbo].[Member]
(
[DisplayName] ASC
)
You should also have a closer look at what you are doing in sf_MemberHasImages, sf_MemberHasVideo and sf_MemberHasAudio. They are used in the column list of the CTE. That is not as bad as being used in the WHERE clause, but they could still cause you problems.
The last one I spotted as a potential problem is sf_MemberHasAvatar. It is used in an ORDER BY in two places. The ORDER BY in ROW_NUMBER() effectively acts like a WHERE here, because of the filtering in the main query's WHERE RowNum BETWEEN (@PG - 1) * @PageSize + 1 clause.
The technique described with persisted column might be possible to use on the other functions as well.
A quick and dirty way to take the UDF call out of every row:
SELECT *, sbuser.sf_DisplayName(MemberID) FROM (
SELECT
a.MemberID,
CAST (ISNULL(a.ProfileTypeID,0) AS bit) AS HasProfile,
a.TimeOffsetDiff * a.TimeOffsetUnits AS TimeOffset,
b.City,
b.StateName AS State,
b.StateAbbr AS abbr,
b.Domain,
b.RegionID,
a.ProfileTypeID,
a.UserName,
a.ImgLib,
a.MemberREgionID AS HomeRegionID,
a.StateID,
a.IsSales,
a.IsAdmin
FROM Member a
INNER JOIN Region b ON b.RegionID = a.MemberRegionID
WHERE a.MemberID = @MemberID
) AS m
Another way, if you don't want to modify any tables, is to just put the UDF logic in the SELECT statement:
case DisplayNameTypeID
when 2 then upper(left(LastName, 1)) + right(LastName, len(LastName) - 1) + ', ' + upper(left(FirstName, 1)) + right(FirstName, len(FirstName) - 1)
when 3 then upper(left(FirstName, 1)) + right(FirstName, len(FirstName) - 1) + ' ' + upper(left(LastName, 1))
when 4 then upper(left(BusinessName, 1)) + right(BusinessName, len(BusinessName) - 1)
end as DisplayName
Yeah, it looks a bit gory, but all you have to do is modify the SP.
Put indexes on the primary and foreign keys (MemberID, RegionID, MemberRegionID)
@Tom Gullen - In this instance the fact that Classic ASP is used would seem to be an irrelevance, since the actual cost in terms of computing in this instance seems to lie with SQL (or whatever db tech this is running on).
@the question - I'd agree with Cosmin that indexing the relevant fields within the tables would provide a definite performance gain, assuming they're not already done.
We had a case about a week ago where my boss was trying to do multiple conditional inserts from a batch file, which was taking forever. We placed a single index on a userid field, and hey presto, the same script took about a minute to execute.
Indexing!
Initial Thoughts
The problem here probably isn't with your stored procedures. Especially in regards to the login script, you are focusing your attention on a small and irrelevant place, as a login is a one-off cost and you can have a much higher tolerance for script execution time on those sorts of pages.
You are using classic ASP, which is quite out of date now. When you are dealing with so many visitors, your server is going to need a lot of power to manage all those requests that it is interpreting. Interpreted pages will run slower than compiled pages.
Time Events
If you are convinced the database is being slow, use timers in your script. Add a general timer at the top of the page, and an SQL timer.
When the page starts loading, initialise the general timer. When you reach a stored procedure, start the SQL timer; when the query has finished, stop it. At the end of the page you have two totals: the SQL timer gives you the time spent running SQL, and the general timer minus the SQL timer gives you the time spent executing code. This helps you separate your database from your code in regards to efficiency.
Improving ASP Page Performance
I've detailed good ASP page design here:
VBScript Out Of Memory Error
Also consider:
Use Option Explicit at the top of your pages.
Set Response.Buffer = True
Use Response.Write inside <% %>; repeatedly opening and closing these tags is slow
I'll re-iterate what I said in the linked answer: by far the best thing you can do for performance is to dump recordset results into an array with .getRows(). Do not loop through recordsets. Do not select fields in queries you do not use. Only have one recordset and one ADO connection per page. I really recommend you read the link for good ASP page design.
Upgrade if no Glaring Issues
What are the specs of the server? Upgrading the hardware is probably your best route to increased performance in this instance, and most efficient in regards to cost/reward.
To replace the UDF, if that is the problem, I recommend having one field in the Member table to store the DisplayName as the data seems to be rather static from the looks of your function. You only need to update the field once in the beginning, and from then on only when someone registers or DisplayNameTypeID is changed. I hope this is helpful for you.
I have an SP that takes 10 seconds to run about 10 times (about a second each time it is run). The platform is ASP.NET, and the server is SQL Server 2005. I have indexed the table (on more than just the PK), and that is not the issue. Some caveats:
usp_SaveKeyword is not the issue. I commented out that entire SP and it made no difference.
I set @SearchID to 1 and the time was significantly reduced, only taking about 15ms on average for the transaction.
I commented out the entire stored procedure except the insert into tblSearches and strangely it took more time to execute.
Any ideas of what could be going on?
set ANSI_NULLS ON
go
ALTER PROCEDURE [dbo].[usp_NewSearch]
@Keyword VARCHAR(50),
@SessionID UNIQUEIDENTIFIER,
@time SMALLDATETIME = NULL,
@CityID INT = NULL
AS
BEGIN
SET NOCOUNT ON;
IF @time IS NULL SET @time = GETDATE();
DECLARE @KeywordID INT;
EXEC @KeywordID = usp_SaveKeyword @Keyword;
PRINT 'KeywordID : '
PRINT @KeywordID
DECLARE @SearchID BIGINT;
SELECT TOP 1 @SearchID = SearchID
FROM tblSearches
WHERE SessionID = @SessionID
AND KeywordID = @KeywordID;
IF @SearchID IS NULL BEGIN
INSERT INTO tblSearches
(KeywordID, [time], SessionID, CityID)
VALUES
(@KeywordID, @time, @SessionID, @CityID)
SELECT Scope_Identity();
END
ELSE BEGIN
SELECT @SearchID
END
END
Why are you using TOP 1 @SearchID instead of MAX(SearchID) or WHERE EXISTS in this query? TOP requires you to run the query and retrieve the first row from the result set. If the result set is large, this could consume quite a lot of resources before you get the final result out.
SELECT TOP 1 @SearchID = SearchID
FROM tblSearches
WHERE SessionID = @SessionID
AND KeywordID = @KeywordID;
I don't see any obvious reason for this - either of aforementioned constructs should get you something semantically equivalent to this with a very cheap index lookup. Unless I'm missing something you should be able to do something like
select @SearchID = isnull (max (SearchID), -1)
from tblSearches
where SessionID = @SessionID
and KeywordID = @KeywordID
This ought to be fairly efficient and (unless I'm missing something) semantically equivalent.
Enable "Display Estimated Execution Plan" in SQL Management Studio - where does the execution plan show you spending the time? It'll guide you on the heuristics being used to optimize the query (or not in this case). Generally the "fatter" lines are the ones to focus on - they're ones generating large amounts of I/O.
Unfortunately, even if you tell us the table schema, only you will be able to see how SQL Server actually chose to optimize the query. One last thing - have you got a clustered index on tblSearches?
Triggers!
They are insidious indeed.
What is the clustered index on tblSearches? If the clustered index is not on primary key, the database may be spending a lot of time reordering.
How many other indexes do you have?
Do you have any triggers?
Where does the execution plan indicate the time is being spent?
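The index and trigger questions above can be answered quickly from the catalog views (SQL Server 2005 and later):
-- Triggers defined on the table
SELECT name, is_disabled, is_instead_of_trigger
FROM sys.triggers
WHERE parent_id = OBJECT_ID('dbo.tblSearches')
-- Indexes on the table, including which one is clustered
SELECT name, type_desc, is_primary_key
FROM sys.indexes
WHERE object_id = OBJECT_ID('dbo.tblSearches')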