Data access from a single table in SQL Server 2005 is too slow

Following is the script of the table. Accessing data from it is too slow.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Emails](
[id] [int] IDENTITY(1,1) NOT NULL,
[datecreated] [datetime] NULL CONSTRAINT [DF_Emails_datecreated]
DEFAULT (getdate()),
[UID] [nvarchar](250) COLLATE Latin1_General_CI_AS NULL,
[From] [nvarchar](100) COLLATE Latin1_General_CI_AS NULL,
[To] [nvarchar](100) COLLATE Latin1_General_CI_AS NULL,
[Subject] [nvarchar](max) COLLATE Latin1_General_CI_AS NULL,
[Body] [nvarchar](max) COLLATE Latin1_General_CI_AS NULL,
[HTML] [nvarchar](max) COLLATE Latin1_General_CI_AS NULL,
[AttachmentCount] [int] NULL,
[Dated] [datetime] NULL
) ON [PRIMARY]
Following query takes 50 seconds to fetch data.
select id, datecreated, UID, [From], [To], Subject, AttachmentCount,
Dated from emails
If I include Body and HTML in the select, the time is even worse.
Indexes are on:
id unique clustered
From Non unique non clustered
To Non unique non clustered
The table currently has 180,000+ records.
There might be 100,000 new records each month, so this will only get slower over time.
Will splitting the data into two tables solve the problem?
What other indexes should there be?

It's almost certainly the volume of the data that's causing the problem. Because of this, you should not fetch the Subject column until you need it. Even fetching SUBSTRING(Subject, 1, 100) may be noticeably faster.
This may be irrelevant, but older versions of SQL Server suffered if the BLOB columns weren't the last in the row, so just as an experiment I'd move [AttachmentCount] and [Dated] above the three nvarchar(max) columns.
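On the splitting question: one option is to move the wide nvarchar(max) columns into a child table keyed on the email id, so that the list query only ever touches the narrow columns. A rough sketch, assuming a new EmailBodies table (the table name and one-to-one split are illustrative, not part of the original schema):
CREATE TABLE [dbo].[EmailBodies](
[EmailId] [int] NOT NULL PRIMARY KEY,  -- same value as Emails.id
[Subject] [nvarchar](max) NULL,
[Body] [nvarchar](max) NULL,
[HTML] [nvarchar](max) NULL,
CONSTRAINT [FK_EmailBodies_Emails] FOREIGN KEY ([EmailId]) REFERENCES [dbo].[Emails]([id])
)
-- The list query now reads only the narrow table
select id, datecreated, UID, [From], [To], AttachmentCount, Dated
from emails
-- The wide columns are fetched only when a single email is opened
select Subject, Body, HTML
from EmailBodies
where EmailId = 1  -- id of the email being opened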

Related

Index seek and Table scan takes the same time to execute

I have a table with a structure like this:
CREATE TABLE [dbo].[User]
(
[Id] [INT] IDENTITY(1,1) NOT NULL,
[CountryCode] [NVARCHAR](2) NOT NULL DEFAULT (N'GB'),
[CreationDate] [DATETIME2](7) NOT NULL,
[Email] [NVARCHAR](256) NULL,
[EmailConfirmed] [BIT] NOT NULL,
[FirstName] [NVARCHAR](MAX) NOT NULL,
[LastName] [NVARCHAR](MAX) NOT NULL,
[LastSignIn] [DATETIME2](7) NOT NULL,
[LockoutEnabled] [BIT] NOT NULL,
[LockoutEnd] [DATETIMEOFFSET](7) NULL,
[NormalizedEmail] [NVARCHAR](256) NULL,
[NormalizedUserName] [NVARCHAR](256) NULL,
[PasswordHash] [NVARCHAR](MAX) NULL,
[SecurityStamp] [NVARCHAR](MAX) NULL,
[TimeZone] [NVARCHAR](64) NOT NULL DEFAULT (N'Europe/London'),
[TwoFactorEnabled] [BIT] NOT NULL,
[UserName] [NVARCHAR](256) NULL,
[LastInfoUpdate] [DATETIME] NOT NULL
)
I have around a million rows in that table, and I want to apply a nonclustered index to the [LastInfoUpdate] column.
So I've created a non-clustered index using this command:
CREATE NONCLUSTERED INDEX IX_ProductVendor_VendorID1
ON [dbo].[TestUsers] (LastInfoUpdate)
INCLUDE(Email)
And when I try to run a simple query like this:
SELECT [LastInfoUpdate]
FROM [dbo].[TestUsers]
WHERE [LastInfoUpdate] >= GETUTCDATE()
I just get the same timing as without the index. According to SQL Server Profiler, the database does an index seek when the index is in place and uses less CPU than it does without the index, but what matters to me is elapsed time. Why is the time the same? What am I doing wrong?
Execution Plan of table scan
Execution plan of Index Scan
Index seek Execution Plan file
Just create the following index:
CREATE INDEX IX_Users_EventDate ON Users(EventDate)
INCLUDE (EventId)
And the following query will be fast:
SELECT EventId, EventDate
FROM Users
WHERE EventDate <= GETUTCDATE()
Because the index is a covering index.
The key of a covering index must include the columns referenced in the WHERE and ORDER BY clauses, and the index must also contain (in the key or via INCLUDE) every column referenced in the SELECT list.
The query you posted doesn't match the query plans you linked; the plans are for the query above.
Another thing to take into account is the number of records returned by the query. If there are many, the query cannot be fast, because it has to read all that data and send it over the network.
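A quick way to see how much of the slowness is simply result-set size is to count the rows that the predicate qualifies (using the table name from the question's index script):
SELECT COUNT(*) AS matching_rows
FROM [dbo].[TestUsers]
WHERE [LastInfoUpdate] >= GETUTCDATE()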
Try a columnstore index. It is faster when you want to read a range of rows from just a few columns:
CREATE NONCLUSTERED COLUMNSTORE INDEX [csi_User_LastInfoUpdate_Email]
ON [dbo].[User] ([LastInfoUpdate], [Email])
WITH (DROP_EXISTING = OFF, COMPRESSION_DELAY = 0) ON [PRIMARY]
An article about columnstore indexes.
"WHERE [LastInfoUpdate] >= GETUTCDATE()" might return a lot of results. Tablescan can in this case be faster than index seek and subsequent adding information from the tabledata.
By adding the queried information to the index you can avoid the costly subsequent looks into tabledata.
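One way to check whether the seek is actually cheaper than the scan, rather than just shaped differently in the plan, is to compare logical reads and elapsed time for both access paths. A quick sketch using the question's table, with the INDEX(0) hint forcing the scan for comparison:
SET STATISTICS IO ON
SET STATISTICS TIME ON

-- Covered by the nonclustered index (key: LastInfoUpdate, include: Email)
SELECT [LastInfoUpdate]
FROM [dbo].[TestUsers]
WHERE [LastInfoUpdate] >= GETUTCDATE()

-- Force the base-table scan for comparison
SELECT [LastInfoUpdate]
FROM [dbo].[TestUsers] WITH (INDEX(0))
WHERE [LastInfoUpdate] >= GETUTCDATE()

SET STATISTICS IO OFF
SET STATISTICS TIME OFF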

Quick SELECT sometimes times out

I have a stored procedure which executes a simple select. Any time I run it manually, it runs in under a second. But in production (a SQL Azure S2 database) it runs inside a scheduled task every 12 hours, so I think it is reasonable to expect it to run "cold" every time, with no cached data. And the performance is very unpredictable: sometimes it takes 5 seconds, sometimes 30 and sometimes even 100.
The select is optimized as far as my knowledge goes: I created a filtered index including all the columns returned by the SELECT, so the only operation in the execution plan is an index scan. There is a huge difference between estimated and actual rows:
But overall the query seems pretty lightweight. I do not blame the environment (SQL Azure) because there are A LOT of queries executing all the time, and this one is the only one with this performance problem.
Here is XML execution plan for SQL ninjas willing to help : http://pastebin.com/u5GCz0vW
EDIT:
Table structure:
CREATE TABLE [myproject].[Purchase](
[Id] [int] IDENTITY(1,1) NOT NULL,
[ProductId] [nvarchar](50) NOT NULL,
[DeviceId] [nvarchar](255) NOT NULL,
[UserId] [nvarchar](255) NOT NULL,
[Receipt] [nvarchar](max) NULL,
[AppVersion] [nvarchar](50) NOT NULL,
[OSType] [tinyint] NOT NULL,
[IP] [nchar](15) NOT NULL,
[CreatedOn] [datetime] NOT NULL,
[ValidationState] [smallint] NOT NULL,
[ValidationInfo] [nvarchar](max) NULL,
[ValidationError] [nvarchar](max) NULL,
[ValidatedOn] [datetime] NULL,
[PurchaseId] [nvarchar](255) NULL,
[PurchaseDate] [datetime] NULL,
[ExpirationDate] [datetime] NULL,
CONSTRAINT [PK_Purchase] PRIMARY KEY CLUSTERED
(
[Id] ASC
)
)
Index definition:
CREATE NONCLUSTERED INDEX [IX_AndroidRevalidationTargets3] ON [myproject].[Purchase]
(
[ExpirationDate] ASC,
[ValidatedOn] ASC
)
INCLUDE ( [ProductId],
[DeviceId],
[UserId],
[Receipt],
[AppVersion],
[OSType],
[IP],
[CreatedOn],
[ValidationState],
[ValidationInfo],
[ValidationError],
[PurchaseId],
[PurchaseDate])
WHERE ([OSType]=(1) AND [ProductId] IS NOT NULL AND [ProductId]<>'trial' AND ([ValidationState] IN ((1), (0), (-2))))
The data can be considered sensitive, so I can't provide a sample.
Since your query returns only 1 match, I think you should trim down your index to a bare minimum. You can get the remaining columns via a Key Lookup from the clustered index:
CREATE NONCLUSTERED INDEX [IX_AndroidRevalidationTargets3] ON [myproject].[Purchase]
(
[ExpirationDate] ASC,
[ValidatedOn] ASC
)
WHERE ([OSType]=(1) AND [ProductId] IS NOT NULL AND [ProductId]<>'trial' AND ([ValidationState] IN ((1), (0), (-2))))
This doesn't eliminate the scan, but it makes the index much leaner for a fast read.
Edit: OP stated that the slimmed-down index was ignored by SQL Server. You can force SQL Server to use the filtered index:
SELECT *
FROM [myproject].[Purchase] WITH (INDEX(IX_AndroidRevalidationTargets3))

Implementing custom fields in a database for large numbers of records

I'm developing an app which requires user-defined custom fields on a contacts table. This contacts table can contain many millions of contacts.
We're looking at using a secondary metadata table which stores information about the fields, along with a tertiary value table which stores the actual data.
Here's the rough schema:
CREATE TABLE [dbo].[Contact](
[ID] [int] IDENTITY(1,1) NOT NULL,
[FirstName] [nvarchar](max) NULL,
[MiddleName] [nvarchar](max) NULL,
[LastName] [nvarchar](max) NULL,
[Email] [nvarchar](max) NULL
)
CREATE TABLE [dbo].[CustomField](
[ID] [int] IDENTITY(1,1) NOT NULL,
[FieldName] [nvarchar](50) NULL,
[Type] [varchar](50) NULL
)
CREATE TABLE [dbo].[ContactAndCustomField](
[ID] [int] IDENTITY(1,1) NOT NULL,
[ContactID] [int] NULL,
[FieldID] [int] NULL,
[FieldValue] [nvarchar](max) NULL
)
However, this approach introduces a lot of complexity, particularly with regard to importing CSV files with multiple custom fields. At the moment this requires an update/join statement and a separate insert statement for every individual custom field. Joins would also be required to return custom field data for multiple rows at once.
I've argued for this structure instead:
CREATE TABLE [dbo].[Contact](
[ID] [int] IDENTITY(1,1) NOT NULL,
[FirstName] [nvarchar](max) NULL,
[MiddleName] [nvarchar](max) NULL,
[LastName] [nvarchar](max) NULL,
[Email] [nvarchar](max) NULL,
[CustomField1] [nvarchar](max) NULL,
[CustomField2] [nvarchar](max) NULL,
[CustomField3] [nvarchar](max) NULL /* etc, adding lots of empty fields */
)
CREATE TABLE [dbo].[ContactCustomField](
[ID] [int] IDENTITY(1,1) NOT NULL,
[FieldIndex] [int] NULL,
[FieldName] [nvarchar](50) NULL,
[Type] [varchar](50) NULL
)
The downside of this second approach is that there is a finite number of custom fields that must be specified when the contacts table is created. I don't think that's a major hurdle given the performance benefits it will surely have when importing large CSV files, and returning result sets.
What approach is the most efficient for large numbers of rows? Are there any downsides to the second technique that I'm not seeing?
Microsoft introduced sparse columns exactly for this type of problem. The point is that in a "classic" design you end up with a large number of columns, most of them NULL for any particular row. The same is true with sparse columns, except that the NULLs don't require any storage. Moreover, you can group sparse columns into a column set and read and modify the whole set as XML.
Performance- and storage-wise, sparse columns are the winner.
http://technet.microsoft.com/en-us/library/cc280604.aspx
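Applied to the schema in the question, a sparse-column version might look roughly like this (the custom-field names and types are illustrative; the column set exposes all sparse columns as a single XML value):
CREATE TABLE [dbo].[Contact](
[ID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
[FirstName] [nvarchar](100) NULL,
[MiddleName] [nvarchar](100) NULL,
[LastName] [nvarchar](100) NULL,
[Email] [nvarchar](256) NULL,
-- sparse columns: NULLs take no storage
[CustomField1] [nvarchar](400) SPARSE NULL,
[CustomField2] [int] SPARSE NULL,
[CustomField3] [datetime] SPARSE NULL,
-- optional: read/write all sparse columns as one XML fragment
[CustomFields] XML COLUMN_SET FOR ALL_SPARSE_COLUMNS
)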
Query performance for any "property bag table" approach is comically slow - but if you need the flexibility, you can either have a dynamic table that is changed via an editor, or you have a property bag table. So when you need it, you need it.
But expect the performance to be slow.
The best approach would likely be a ContactCustomFields table whose fields are determined by an editor.
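To make the cost concrete, this is roughly the join/pivot shape the value-table design from the question forces whenever several custom fields have to come back as columns for many contacts at once (the field names here are made up for illustration):
SELECT c.ID,
c.FirstName,
c.LastName,
MAX(CASE WHEN f.FieldName = 'Birthday' THEN v.FieldValue END) AS Birthday,
MAX(CASE WHEN f.FieldName = 'Company' THEN v.FieldValue END) AS Company
FROM [dbo].[Contact] c
LEFT JOIN [dbo].[ContactAndCustomField] v ON v.ContactID = c.ID
LEFT JOIN [dbo].[CustomField] f ON f.ID = v.FieldID
GROUP BY c.ID, c.FirstName, c.LastName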

SQL query not returning record

I am trying to retrieve a record from a table with a given field value. The query is:
declare @imei varchar(50)
set @imei = 'ee262b57-ccb4-4a2b-8410-6d8621fd9328'
select *
from tblDevices
where imei = @imei
which returns nothing.
If I comment out the where clause all records are returned, including the one I am looking for. The value is clearly in the table field and matches exactly, but I cannot get the where clause to work.
I literally copied the value out of the table to ensure it was correct.
I would appreciate any guidance on my mistake.
Table def:
CREATE TABLE [dbo].[tblDevices](
[id] [int] IDENTITY(1,1) NOT NULL,
[create_date] [datetime] NOT NULL,
[update_date] [datetime] NOT NULL,
[other_id] [int] NULL,
[description] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[authorized] [int] NOT NULL,
[imei] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
CONSTRAINT [PK_tblDevices] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
Edit
Using user2864740's suggestion, I queried the following:
select hashbytes('SHA1', imei) as h1 from tblDevices where id = 8
returns:
0x43F9067C174B2F2F2C0FFD17B9AC7F54B3C630A2
select hashbytes('SHA1', @imei) as h2
returns:
0xB9B82BB440B04729B2829B335E6D6B450572D2AB
So, I am not sure what this means. My poor little brain is having a hard time understanding that A <> A?! What is going on here if it's not a collation issue? How can two identical values not be considered equal?
Edit 2
this is the table record I want:
8 2013-10-22 12:43:10.223 2013-10-22 12:43:10.223 -1 1 ee262b57-ccb4-4a2b-8410-6d8621fd9328
Kind of taking a wild stab, but with the two hashes showing the values are in fact different, I wonder if you just have an extra space somewhere. Maybe try:
select *
from tblDevices
where TRIM(imei) = @imei
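Note that TRIM only exists in SQL Server 2017 and later; on older versions LTRIM(RTRIM(imei)) is the equivalent. Before rewriting the predicate it may also be worth comparing byte lengths, to see whether whitespace or some other hidden character really is the difference. A quick check along these lines:
declare @imei varchar(50)
set @imei = 'ee262b57-ccb4-4a2b-8410-6d8621fd9328'

select id,
datalength(imei) as stored_bytes,     -- includes trailing spaces
datalength(@imei) as parameter_bytes,
len(imei) as stored_len               -- LEN ignores trailing spaces
from tblDevices
where id = 8  -- the row shown in the question's edit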

Table size strange in SQL Server

I have 2 tables in SQL Server
TbUrl
Index space: 12.531 MB
Row count: 247,505
Data space: 1,965.891 MB
Table structure:
CREATE TABLE [TbUrl](
[IdUrl] [Int] IDENTITY(1,1) NOT NULL,
[IdSupply] [Int] NOT NULL,
[Uri] [varchar](512) NOT NULL,
[UrlCod] [varchar](256) NOT NULL,
[Status] [Int] NOT NULL,
[InsertionDate] [datetime] NOT NULL,
[UpdatedDate] [datetime] NULL,
[UpdatedIp] [varchar](15) NULL
)
TbUrlDetail
Index space: 29.406 MB
Row count: 234,209
Data space: 386.047 MB
Structure:
CREATE TABLE [TbUrlDetail](
[IdUrlDetail] [Int] IDENTITY(1,1) NOT NULL,
[IdUri] [Int] NOT NULL,
[Title] [varchar](512) NOT NULL,
[Sku] [varchar](32) NOT NULL,
[MetaKeywords] [varchar](512) NOT NULL,
[MetaDescription] [varchar](512) NOT NULL,
[Price] [money] NOT NULL,
[Description] [text] NOT NULL,
[Stock] [Bit] NOT NULL,
[StarNumber] [Int] NOT NULL,
[ReviewNumber] [Int] NOT NULL,
[Category] [varchar](256) NOT NULL,
[UrlShort] [varchar](32) NULL,
[ReleaseDate] [datetime] NOT NULL,
[InsertionDate] [datetime] NOT NULL
)
The size of TbUrl is very large compared with TbUrlDetail.
The layout (design) of TbUrl is smaller than that of TbUrlDetail, yet its data space is far larger.
I've run a database shrink, but TbUrl's space doesn't go down.
What might be happening? How do I decrease the space of this table?
Is there a clustered index on the table? (If not, you could be suffering from a lot of forwarding pointers - ref.) Have you made drastic changes to the data or the data types, or added/dropped columns? (If you have, then a lot of the previously occupied space may not be re-usable. One ref where changing a fixed-length column to variable-length does not reclaim space.)
In both cases you should be able to recover the wasted space by rebuilding the table (which will also rebuild all of the clustered indexes):
ALTER TABLE dbo.TbUrl REBUILD;
If you are on Enterprise Edition you can do this online:
ALTER TABLE dbo.TbUrl REBUILD WITH (ONLINE = ON);
Shrinking the entire database is not the magic answer here. And if there is no clustered index on this table, I strongly suggest you consider one before performing the rebuild.
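If you want to confirm the forwarded-record theory before rebuilding, the physical-stats DMV reports a forwarded record count for heaps; a quick check might look like this (run in the database that holds TbUrl):
SELECT index_type_desc,
forwarded_record_count,   -- only populated for heaps in SAMPLED/DETAILED mode
page_count,
avg_page_space_used_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'TbUrl'), NULL, NULL, 'DETAILED')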
With VARCHAR() fields, the amount of space actually taken does vary according to the amount of text put in those fields.
Could you perhaps have (on average) much shorter entries in one table than in the other?
Try
SELECT
SUM(CAST(LEN(uri) + LEN(urlcod) AS BIGINT)) AS character_count
FROM
TbUrl
SELECT
SUM(CAST(LEN(title) + LEN(metakeywords) + LEN(metadescription) + LEN(Category) AS BIGINT)) AS character_count
FROM
TbUrlDetail
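For the per-table totals themselves (reserved, data, index size and unused space), sp_spaceused is a quicker check than summing column lengths:
EXEC sp_spaceused N'TbUrl'
EXEC sp_spaceused N'TbUrlDetail'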