SQL Index Update with Covering Columns - sql

I am creating an index on a table and I want to include a covering column: messageText nvarchar(1024)
After insertion, the messageText is never updated, so it's an ideal candidate to include in a covering index to speed up lookups.
But what happens if I update other columns in the same index?
Will the entire row in the index need reallocating or will just that data from the updated column be updated in the index?
Simple Example
Imagine the following table:
CREATE TABLE [Messages](
[messageID] [int] IDENTITY(1,1) NOT NULL,
[mbrIDTo] [int] NOT NULL,
[isRead] [bit] NOT NULL,
[messageText] [nvarchar](1024) NOT NULL
)
And the following Index:
CREATE NONCLUSTERED INDEX [IX_messages] ON [Messages] ( [mbrIDTo] ASC, [messageID] ASC )
INCLUDE ( [isRead], [messageText])
When we update the table:
UPDATE Messages
SET isRead = 1
WHERE (mbrIDTo = 6546)
The query plan shows that the index IX_messages is used and will also be updated, because the column isRead is part of the index.
Therefore, does including a large text column (such as messageText above) as a covering column in an index hurt performance when other values in that same index are updated?

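When an update changes an index key column, SQL Server deletes the index row and inserts a new row with the updated values, so messageText is written to disk again even though it has not changed. Updates that touch only non-key columns, such as isRead here, can often be applied in place, but the wide included column still makes every row in the index larger.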
Here is a blog post from Paul Randal with a good example: http://www.sqlskills.com/blogs/paul/do-changes-to-index-keys-really-do-in-place-updates/
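One way to see how much write activity the index really gets is to compare its leaf-level counters before and after running the UPDATE. This is only a sketch: it assumes the table lives in the dbo schema, and it relies on the sys.dm_db_index_operational_stats DMV, whose counters are cumulative, so only the difference between two runs means anything.

```sql
-- Leaf-level write counters for every index on dbo.Messages
-- (the dbo schema is an assumption; adjust to your own).
-- Run once before and once after the UPDATE, then compare the numbers.
SELECT i.name AS index_name,
       s.leaf_insert_count,
       s.leaf_delete_count,
       s.leaf_update_count,
       s.leaf_ghost_count
FROM sys.dm_db_index_operational_stats(DB_ID(), OBJECT_ID(N'dbo.Messages'), NULL, NULL) AS s
JOIN sys.indexes AS i
  ON i.object_id = s.object_id
 AND i.index_id  = s.index_id;
```

If the counters for IX_messages climb with every isRead update, it may be worth testing a second index that leaves messageText out and comparing the two.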

Related

If I place a composite index on three columns and use them in the same query but in different places, will it still be effective?

With the following table and index:
CREATE TABLE [Ticket]
(
[Id] BIGINT IDENTITY NOT NULL,
[Title] CHARACTER VARYING(255) NOT NULL,
[Description] CHARACTER VARYING(MAX) NOT NULL,
[Severity] INTEGER NOT NULL,
[Priority] INTEGER NOT NULL,
[CreatedOn] DATETIMEOFFSET NOT NULL,
PRIMARY KEY([Id])
);
CREATE INDEX [Ticket_Priority_Severity_CreatedOn_IX] ON [Ticket]([Priority], [Severity], [CreatedOn]);
Will the following query:
SELECT [Id]
FROM [Ticket]
WHERE [Priority] = 1
ORDER BY [Severity] DESC, [CreatedOn] ASC
make use of the entire composite index or only utilize the [Priority] part of the index?
I know that for a query that had all of the columns in the WHERE clause, the whole index would be used. I am unsure about the above case though!
Given the actual execution plan below, on a table with no statistics, I am not sure how to interpret it.
It does look like it used the index, but which parts? There is clearly a sort cost, but is that sorting by [Severity] and then [CreatedOn] after doing a seek on [Priority]?
It may use the index, but it will only use the Priority part efficiently since you have the index sorted in a way that is not optimal for the query;
ORDER BY [Severity] DESC, [CreatedOn] ASC
vs
CREATE INDEX [Ticket_Priority_Severity_CreatedOn_IX] ON
[Ticket]([Priority], [Severity], [CreatedOn]);
As you can see in this fiddle if you click the execution plan, the query is split into an index seek and a sort.
Since Severity is sorted ascending in the index, the index won't be (optimally) used for the sort. If you really want an optimal sort, index Severity descending, the way your query uses it:
CREATE INDEX [Ticket_Priority_Severity_CreatedOn_IX] ON
[Ticket]([Priority], [Severity] DESC, [CreatedOn]);
An SQLfiddle with the fixed index. Note that the whole query is now an index seek.
Note that the plan may look different for you depending on your data, but in general this holds: an index sorted the same way the query accesses it will be used more effectively.
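If the fiddle is not available, the same comparison can be made locally. A minimal sketch, assuming you run it from SSMS or sqlcmd (GO is a batch separator of those tools, not T-SQL): it prints the estimated plan as text instead of executing the query. Run it once with each version of the index and check whether a Sort operator appears above the Index Seek.

```sql
-- Return the estimated plan as text rather than running the query.
SET SHOWPLAN_TEXT ON;
GO
SELECT [Id]
FROM [Ticket]
WHERE [Priority] = 1
ORDER BY [Severity] DESC, [CreatedOn] ASC;
GO
SET SHOWPLAN_TEXT OFF;
GO
```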

Index to enforce a single TRUE value per table in bit column

I have a table that contains an IsDefault column:
CREATE TABLE CustomerType
(
ID int IDENTITY(1,1) NOT NULL,
Name nvarchar(50) NOT NULL,
IsDefault bit NOT NULL
)
The IsDefault value should, naturally, be TRUE for only a single row, all other rows should be FALSE. I want to enforce this rule on the database level.
Currently I achieve this by adding a new computed column and placing a UNIQUE NONCLUSTERED INDEX on it:
CREATE TABLE CustomerType
(
ID int IDENTITY(1,1) NOT NULL,
Name nvarchar(50) NULL,
IsDefault bit NOT NULL,
IsDefaultConstraint AS (CASE WHEN IsDefault = 1 THEN 1 ELSE -ID END)
)
CREATE UNIQUE NONCLUSTERED INDEX UQ_CustomerType_IsDefault ON CustomerType
(
IsDefaultConstraint ASC
)
This works just fine, but has a bit of code smell to it because the extra column doesn't contain relevant data and is just used for enforcing the unique index.
Are there alternative ways to enforce the same behavior?
For SQL Server 2008 or later, use a filtered index:
CREATE UNIQUE INDEX IX_Default on CustomerType (IsDefault) WHERE IsDefault = 1
For older versions, use the "poor man's filtered index", the indexed view:
CREATE VIEW dbo.DRI_CustomerType_Default
WITH SCHEMABINDING
AS
SELECT IsDefault FROM dbo.CustomerType WHERE IsDefault = 1
GO
CREATE UNIQUE CLUSTERED INDEX IX_Default on DRI_CustomerType_Default (IsDefault)
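Either variant can be checked with a few inserts. The sketch below uses made-up sample rows and assumes one of the two unique indexes above is already in place; the second default row should be rejected, and the CATCH block just reports whatever error the engine raises.

```sql
-- Sample rows are invented for illustration.
INSERT INTO CustomerType (Name, IsDefault) VALUES (N'Retail', 1);    -- first default: accepted
INSERT INTO CustomerType (Name, IsDefault) VALUES (N'Wholesale', 0); -- non-defaults are unrestricted
INSERT INTO CustomerType (Name, IsDefault) VALUES (N'Partner', 0);

BEGIN TRY
    -- A second default should violate the unique index.
    INSERT INTO CustomerType (Name, IsDefault) VALUES (N'Online', 1);
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_MESSAGE() AS ErrorMessage;
END CATCH;
```

Note that neither index requires a default to exist; they only prevent more than one.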
Unfortunately SQL Server doesn't provide function-based indexes, which is what you are looking for. So your approach is the best available.
If the additional column is too annoying, then use a view on the table hiding that column.
If this is still annoying you, switch to Oracle ;-)

SQL Server why index is not used

I have the following table in a SQL Server 2008 database:
CREATE TABLE [dbo].[Actions](
[ActionId] [int] IDENTITY(1,1) NOT NULL,
[ActionTypeId] [int] NOT NULL,
[Name] [nvarchar](50) NOT NULL,
[Description] [nvarchar](1000) NOT NULL,
[Comment] [nvarchar](500) NOT NULL,
[Created] [datetime] NOT NULL,
[Executed] [datetime] NULL,
[DisplayText] [nvarchar](1000) NULL,
[ExecutedBy] [int] NULL,
[Result] [int] NULL,
CONSTRAINT [PK_Actions] PRIMARY KEY CLUSTERED
(
[ActionId] ASC
)
) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IX_Actions_Executed] ON [dbo].[Actions]
(
[Executed] ASC,
[ExecutedBy] ASC
)
There are 20 000 rows that have an Executed date equal to '2500-01-01' and 420 000 rows that have an Executed date < '2500-01-01'.
When I execute a query
select ActionId, Executed, ExecutedBy, DisplayText from Actions
where Executed='2500-01-01'
the query plan shows that a clustered index scan on PK_Actions is performed and the index IX_Actions_Executed is not used at all.
Funnily enough, I also get a missing index hint that says
/* The Query Processor estimates that implementing the following index could improve the query cost by 99.9901%.
*/
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[Actions] ([Executed])
But the index is already there.
Why is the index not used if it would select about 5% of the data?
Most likely, the query optimizer just sees that you're selecting DisplayText as well, so for each of the 20'000 rows found in the NC index, there would need to be a key lookup into the clustered index to get that data, and key lookups are expensive operations! So in the end, it might just be easier and more efficient to scan the clustered index right away.
I bet if you run this query here:
select ActionId, Executed, ExecutedBy
from Actions
where Executed='2500-01-01'
then the NC index will be used.
If you really need the DisplayText and that's a query you'll run frequently, maybe you should include that column in the index as an extra column in the leaf level:
DROP INDEX [IX_Actions_Executed] ON [dbo].[Actions]
CREATE NONCLUSTERED INDEX [IX_Actions_Executed]
ON [dbo].[Actions]([Executed] ASC, [ExecutedBy] ASC)
INCLUDE([DisplayText])
This would make your NC index a covering index, i.e. it could return all columns needed for your query. If you run your original query again with this covering index in place, I'm pretty sure SQL Server's query optimizer will indeed use it. The probability that any NC index will be used is significantly increased if that NC index is a covering index, i.e. if queries can get all the columns they need from just the NC index, without key lookups.
The missing index hints are a bit misleading at times - there are also known bugs that lead SQL Server Mgmt Studio to continuously recommend indices that are already in place..... don't bet too much of your money on those index hints!
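To see the cost of the key lookups for yourself, you can force each access path with an index hint and compare the logical reads. A rough sketch, using the names from the table definition (dbo.Actions, ActionId); the hints are for experimenting only, not something to leave in production code.

```sql
SET STATISTICS IO ON;

-- Path 1: force the nonclustered index (seek plus one key lookup per matching row).
SELECT ActionId, Executed, ExecutedBy, DisplayText
FROM dbo.Actions WITH (INDEX(IX_Actions_Executed))
WHERE Executed = '2500-01-01';

-- Path 2: force the clustered index (the scan the optimizer chose on its own).
SELECT ActionId, Executed, ExecutedBy, DisplayText
FROM dbo.Actions WITH (INDEX(PK_Actions))
WHERE Executed = '2500-01-01';

SET STATISTICS IO OFF;
```

The Messages tab then shows the logical reads for each statement, which is usually enough to tell which path is cheaper for your data.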

Will creating index help in this case

I'm still learning SQL Server 2005.
Here is my table structure
CREATE TABLE [dbo].[Trn_PostingGroups](
[ControlGroup] [char](5) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[PracticeCode] [char](5) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[ScanDate] [smalldatetime] NULL,
[DepositDate] [smalldatetime] NULL,
[NameOfFile] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[DepositValue] [decimal](11, 2) NULL,
[RecordStatus] [char](1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
CONSTRAINT [PK_Trn_PostingGroups_1] PRIMARY KEY CLUSTERED
(
[ControlGroup] ASC,
[PracticeCode] ASC
)WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
Scenario 1 : Suppose I have a query like this...
Select * from Trn_PostingGroups where PracticeCode = 'ABC'
Will indexing on PracticeCode separately help make my query faster?
Scenario 2 :
Select * from Trn_PostingGroups
where
ControlGroup = 12701
and PracticeCode = 'ABC'
and NameOfFile = 'FileName1'
Will indexing on NameOfFile separately help make my query faster?
If you were only selecting on the first field (ControlGroup), it is the primary sort of the clustered index and you wouldn't need to index the other field.
If you select on the other primary key fields, then adding a separate index on the other fields should help with such selects.
In general, you should index fields that are commonly used in ORDER BY and WHERE clauses. This is of course oversimplified.
See this article for more information about optimizing (statistics and query analyser).
SQL Server will usually use only one index per table per query (unless you consider self joins or CTEs). If you have several that could be used on the same table in the same query, SQL Server will use statistics to determine which would be better to use.
In Scenario 1, if you create an index on PracticeCode alone, it will usually be used, as long as the table has enough rows that a table scan costs more and the column holds a diverse range of values. An index will not be used if there are only a few rows in the table (it is faster to just look at them all), or if most of the values in that column are the same. The PK will not help with this query: it would be like looking for a first name in the phone book, which is ordered by last name + first name. You might consider reversing your PK to PracticeCode+ControlGroup if you never search on ControlGroup by itself.
In Scenario 2, if you have an index on NameOfFile, it will probably use the PK and ignore the NameOfFile index, unless you make the NameOfFile index unique, and then it is a toss-up. You might try to create an index (in addition to your PK) on ControlGroup+PracticeCode+NameOfFile. If you have many files per ControlGroup+PracticeCode, then it may select that index over the PK index.
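A sketch of the index suggested for Scenario 1; the index name is made up. One more thing worth checking in Scenario 2: ControlGroup is char(5), so comparing it with the number 12701 forces an implicit conversion of the column, which can prevent an index seek. Quoting the literal, as below, avoids that.

```sql
-- Scenario 1: a separate index so PracticeCode can be searched on its own
-- (index name is hypothetical).
CREATE NONCLUSTERED INDEX [IX_Trn_PostingGroups_PracticeCode]
ON [dbo].[Trn_PostingGroups] ([PracticeCode] ASC);

SELECT *
FROM [dbo].[Trn_PostingGroups]
WHERE [PracticeCode] = 'ABC';

-- Scenario 2: same query as in the question, but with ControlGroup compared
-- to a character literal so the char(5) column is not converted to int.
SELECT *
FROM [dbo].[Trn_PostingGroups]
WHERE [ControlGroup] = '12701'
  AND [PracticeCode] = 'ABC'
  AND [NameOfFile] = 'FileName1';
```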

Increasing performance on a logging table in SQL Server 2005

I have a "history" table where I log each request into a Web Handler on our web site. Here is the table definition:
/****** Object: Table [dbo].[HistoryRequest] Script Date: 10/09/2009 17:18:02 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[HistoryRequest](
[HistoryRequestID] [uniqueidentifier] NOT NULL,
[CampaignID] [int] NOT NULL,
[UrlReferrer] [nvarchar](512) NOT NULL,
[UserAgent] [nvarchar](512) NOT NULL,
[UserHostAddress] [nvarchar](15) NOT NULL,
[UserHostName] [nvarchar](512) NOT NULL,
[HttpBrowserCapabilities] [xml] NOT NULL,
[Created] [datetime] NOT NULL,
[CreatedBy] [nvarchar](100) NOT NULL,
[Updated] [datetime] NULL,
[UpdatedBy] [nvarchar](100) NULL,
CONSTRAINT [PK_HistoryRequest] PRIMARY KEY CLUSTERED
(
[HistoryRequestID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[HistoryRequest] WITH CHECK ADD CONSTRAINT [FK_HistoryRequest_Campaign] FOREIGN KEY([CampaignID])
REFERENCES [dbo].[Campaign] ([CampaignId])
GO
ALTER TABLE [dbo].[HistoryRequest] CHECK CONSTRAINT [FK_HistoryRequest_Campaign]
GO
37 seconds for 1050 rows on this statement:
SELECT *
FROM HistoryRequest AS hr
WHERE Created > '10/9/2009'
ORDER BY Created DESC
Does anyone have any suggestions for speeding this up? I have a Clustered Index on the PK and a regular Index on the CREATED column. I tried a Unique Index and it barfed complaining there is a duplicate entry somewhere - which can be expected.
Any insights are welcome!
You are requesting all columns (*) over a non-covering index (created). On a large data set you are guaranteed to hit the Index Tipping Point where the clustered index scan is more efficient than a nonclustered index range seek and bookmark lookup.
Do you need * always? If yes, and if the typical access pattern is like this, then you must organize the table accordingly and make Created the leftmost clustered key.
If not, then consider changing your query to a coverable query, e.g. select only HistoryRequestID and Created, which are covered by the nonclustered index. If more fields are needed, add them as included columns to the nonclustered index, but take into account that this will add extra storage space and log write IO time.
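A sketch of the covering approach. The index name and the choice of included columns (CampaignID, UserHostAddress) are only assumptions for illustration; include whatever columns your query really needs, and keep the wide ones (such as the xml column) out if you can.

```sql
-- Hypothetical covering index: Created is the key, the extra columns ride along
-- at the leaf level. HistoryRequestID is carried automatically because it is
-- the clustered key.
CREATE NONCLUSTERED INDEX [IX_HistoryRequest_Created_Covering]
ON [dbo].[HistoryRequest] ([Created] DESC)
INCLUDE ([CampaignID], [UserHostAddress]);

-- Coverable version of the query: only columns present in the index.
-- '20091009' is the same date as in the question, written in a format
-- that does not depend on language settings.
SELECT hr.HistoryRequestID, hr.CampaignID, hr.UserHostAddress, hr.Created
FROM dbo.HistoryRequest AS hr
WHERE hr.Created > '20091009'
ORDER BY hr.Created DESC;
```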
Hey, I've seen some odd behavior when pulling XML columns in large sets. Try putting your index on Created back, then specify the columns in your select statement; but omit the XML. See how that affects the return time for results.
For a log table, you probably don't need a uniqueidentifier column. You're not likely to query on it either, so it's not a good candidate for a clustered index. Your sample query is on "Created", yet there's no index on it. If you query frequently on ranges of "Created" values then it would be a good candidate for clustering even though it's not necessarily unique.
OTOH, the foreign key suggests frequent querying by Campaign, in which case having the clustering done by that column could make sense, and would also probably do a better job of scattering the inserted keys in the indexes - both the surrogate key and the timestamp would add records in sequential order, which is net more work over time for insertions because the node sectors are filled less randomly.
If it's just a log table, why does it have update audit columns? It would normally be write-only.
Rebuild indexes. Use the WITH (NOLOCK) clause after the table names where appropriate; this probably applies if you want to run long(ish) running queries against tables that are heavily used in a live environment (such as a log table). It basically means your query might miss some of the very latest records (or see rows that have not been committed yet), but you also aren't holding a lock open on the table - which creates additional overhead.
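Roughly, those two suggestions look like this. It is a sketch only: the column list is an arbitrary example, and the rebuild runs offline by default, so it blocks the table while it runs.

```sql
-- Rebuild every index on the table (offline by default; schedule accordingly).
ALTER INDEX ALL ON [dbo].[HistoryRequest] REBUILD;

-- Read without taking shared locks. NOLOCK is the same as READ UNCOMMITTED,
-- so the result can include uncommitted rows or miss rows that are moving.
SELECT hr.HistoryRequestID, hr.CampaignID, hr.Created
FROM dbo.HistoryRequest AS hr WITH (NOLOCK)
WHERE hr.Created > '20091009'
ORDER BY hr.Created DESC;
```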