I have 2 tables in SQL Server
TbUrl
INDEX SPACE: 12.531 MB
ROW COUNT: 247,505
DATA SPACE: 1,965.891 MB
Table structure:
CREATE TABLE [TbUrl](
[IdUrl] [Int] IDENTITY(1,1) NOT NULL,
[IdSupply] [Int] NOT NULL,
[Uri] [varchar](512) NOT NULL,
[UrlCod] [varchar](256) NOT NULL,
[Status] [Int] NOT NULL,
[InsertionDate] [datetime] NOT NULL,
[UpdatedDate] [datetime] NULL,
[UpdatedIp] [varchar](15) NULL
)
TbUrlDetail
INDEX SPACE: 29.406 MB
ROW COUNT: 234,209
DATA SPACE: 386.047 MB
Structure:
CREATE TABLE [TbUrlDetail](
[IdUrlDetail] [Int] IDENTITY(1,1) NOT NULL,
[IdUri] [Int] NOT NULL,
[Title] [varchar](512) NOT NULL,
[Sku] [varchar](32) NOT NULL,
[MetaKeywords] [varchar](512) NOT NULL,
[MetaDescription] [varchar](512) NOT NULL,
[Price] [money] NOT NULL,
[Description] [text] NOT NULL,
[Stock] [Bit] NOT NULL,
[StarNumber] [Int] NOT NULL,
[ReviewNumber] [Int] NOT NULL,
[Category] [varchar](256) NOT NULL,
[UrlShort] [varchar](32) NULL,
[ReleaseDate] [datetime] NOT NULL,
[InsertionDate] [datetime] NOT NULL
)
The data space of TbUrl is very large compared with TbUrlDetail.
TbUrl has fewer columns than TbUrlDetail, yet its data space is far bigger.
I've run SHRINK on the database, but the space used by TbUrl doesn't go down.
What might be happening? How do I reduce the space used by this table?
Is there a clustered index on the table? If not, you could be suffering from a lot of forwarded records. Have you made drastic changes to the data or the data types, or added/dropped columns? If so, a lot of the space previously occupied may no longer be reusable (for example, changing a fixed-length column to a variable-length one does not reclaim the space).
In both cases you should be able to recover the wasted space by rebuilding the table (which will also rebuild the clustered index, if there is one):
ALTER TABLE dbo.TbUrl REBUILD;
If you are on Enterprise Edition you can do this online:
ALTER TABLE dbo.TbUrl REBUILD WITH (ONLINE = ON);
Shrinking the entire database is not the magic answer here. And if there is no clustered index on this table, I strongly suggest you consider one before performing the rebuild.
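If you want to confirm the forwarded-record theory before rebuilding, a quick check against sys.dm_db_index_physical_stats will show the forwarded record count for a heap. A sketch, assuming the table lives in the current database under the dbo schema:

-- forwarded_record_count is only populated for heaps, and only in
-- SAMPLED/DETAILED mode (it is NULL in LIMITED mode).
SELECT index_type_desc,
       forwarded_record_count,
       avg_page_space_used_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'dbo.TbUrl'), NULL, NULL, 'DETAILED');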
With VARCHAR() fields, the amount of space actually taken does vary according to the amount of text put in those fields.
Could you perhaps have (on average) much shorter entries in one table than in the other?
Try:

SELECT SUM(CAST(LEN(Uri) + LEN(UrlCod) AS bigint)) AS character_count
FROM TbUrl;

SELECT SUM(CAST(LEN(Title) + LEN(MetaKeywords) + LEN(MetaDescription) + LEN(Category) AS bigint)) AS character_count
FROM TbUrlDetail;
I've been having a problem with my database where the id column keeps incrementing even though rows have been removed. To better understand what I mean, here is a screenshot of my gridview:
As you can see from the id column, everything is fine from 11 to 16, but then it suddenly jumps to 25 - 27. What I want to happen is, when I remove an item, the numbering should continue from the last id, which is 16, so the next id should be 17. I hope this makes sense.
Here is also part of the SQL script:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[guitarItems]
(
[id] [int] IDENTITY(1,1) NOT NULL,
[type] [varchar](50) NOT NULL,
[brand] [varchar](50) NOT NULL,
[model] [varchar](50) NOT NULL,
[price] [float] NOT NULL,
[itemimage1] [varchar](255) NULL,
[itemimage2] [varchar](255) NULL,
[description] [text] NOT NULL,
[necktype] [varchar](100) NOT NULL,
[body] [varchar](100) NOT NULL,
[fretboard] [varchar](100) NOT NULL,
[fret] [varchar](50) NOT NULL,
[bridge] [varchar](100) NOT NULL,
[neckpickup] [varchar](100) NOT NULL,
[bridgepickup] [varchar](100) NOT NULL,
[hardwarecolor] [varchar](50) NOT NULL,
PRIMARY KEY CLUSTERED ([id] ASC)
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
You can use:
DBCC CHECKIDENT ("YourTableNameHere", RESEED, 1);
Before using it, see the documentation: https://learn.microsoft.com/en-us/sql/t-sql/database-console-commands/dbcc-checkident-transact-sql
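If the goal is just to have the next insert continue from the highest id still in the table, you can reseed to the current maximum. A minimal sketch, assuming the dbo.guitarItems table from the question:

-- The next inserted row will get MAX(id) + 1.
DECLARE @maxId int = (SELECT ISNULL(MAX(id), 0) FROM dbo.guitarItems);
DBCC CHECKIDENT ('dbo.guitarItems', RESEED, @maxId);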
Primary auto-increment keys in the database are used to uniquely identify a given row and shouldn't be given any business meaning. So leave the primary key as it is and add another column, for example guitarItemsId. Then, when you delete a record from the database, you may want to send an additional UPDATE statement that decreases the guitarItemsId of all rows whose guitarItemsId is greater than the one you are currently deleting.
Also, remember that you should never modify the value of a primary key in a relational database because there could be other tables that reference it as a foreign key and modifying it might violate the referential constraints.
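For completeness, the renumbering UPDATE described above might look like this sketch, where guitarItemsId is the added display column and @deletedDisplayId is a hypothetical variable holding the display number of the row that was just removed:

-- Close the gap left by the delete (display numbers only, never the primary key).
UPDATE dbo.guitarItems
SET guitarItemsId = guitarItemsId - 1
WHERE guitarItemsId > @deletedDisplayId;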
I have a stored procedure which executes a simple select. Any time I run it manually, it completes in under a second. But in production (a SQL Azure S2 database) it runs inside a scheduled task every 12 hours, so I think it is reasonable to expect it to run "cold" every time, with no cached data. And the performance is very unpredictable: sometimes it takes 5 seconds, sometimes 30, and sometimes even 100.
The select is optimized to the maximum (of my knowledge, anyway): I created a filtered index including all the columns returned by the SELECT, so the only operation in the execution plan is an index scan. There is a huge difference between estimated and actual rows:
But overall the query seems pretty lightweight. I do not blame the environment (SQL Azure), because there are A LOT of queries executing all the time, and this one is the only one with this performance problem.
Here is the XML execution plan for SQL ninjas willing to help: http://pastebin.com/u5GCz0vW
EDIT:
Table structure:
CREATE TABLE [myproject].[Purchase](
[Id] [int] IDENTITY(1,1) NOT NULL,
[ProductId] [nvarchar](50) NOT NULL,
[DeviceId] [nvarchar](255) NOT NULL,
[UserId] [nvarchar](255) NOT NULL,
[Receipt] [nvarchar](max) NULL,
[AppVersion] [nvarchar](50) NOT NULL,
[OSType] [tinyint] NOT NULL,
[IP] [nchar](15) NOT NULL,
[CreatedOn] [datetime] NOT NULL,
[ValidationState] [smallint] NOT NULL,
[ValidationInfo] [nvarchar](max) NULL,
[ValidationError] [nvarchar](max) NULL,
[ValidatedOn] [datetime] NULL,
[PurchaseId] [nvarchar](255) NULL,
[PurchaseDate] [datetime] NULL,
[ExpirationDate] [datetime] NULL,
CONSTRAINT [PK_Purchase] PRIMARY KEY CLUSTERED
(
[Id] ASC
)
)
Index definition:
CREATE NONCLUSTERED INDEX [IX_AndroidRevalidationTargets3] ON [myproject].[Purchase]
(
[ExpirationDate] ASC,
[ValidatedOn] ASC
)
INCLUDE ( [ProductId],
[DeviceId],
[UserId],
[Receipt],
[AppVersion],
[OSType],
[IP],
[CreatedOn],
[ValidationState],
[ValidationInfo],
[ValidationError],
[PurchaseId],
[PurchaseDate])
WHERE ([OSType]=(1) AND [ProductId] IS NOT NULL AND [ProductId]<>'trial' AND ([ValidationState] IN ((1), (0), (-2))))
Data can be considered sensitive, so I can't provide a sample.
Since your query returns only 1 match, I think you should trim down your index to a bare minimum. You can get the remaining columns via a Key Lookup from the clustered index:
CREATE NONCLUSTERED INDEX [IX_AndroidRevalidationTargets3] ON [myproject].[Purchase]
(
[ExpirationDate] ASC,
[ValidatedOn] ASC
)
WHERE ([OSType]=(1) AND [ProductId] IS NOT NULL AND [ProductId]<>'trial' AND ([ValidationState] IN ((1), (0), (-2))))
This doesn't eliminate the scan, but it makes the index much leaner for a fast read.
Edit: OP stated that the slimmed-down index was ignored by SQL Server. You can force SQL Server to use the filtered index:
SELECT *
FROM [myproject].[Purchase] WITH (INDEX(IX_AndroidRevalidationTargets3))
I'm developing an app which requires user-defined custom fields on a contacts table. This contacts table can contain many millions of contacts.
We're looking at using a secondary metadata table which stores information about the fields, along with a tertiary value table which stores the actual data.
Here's the rough schema:
CREATE TABLE [dbo].[Contact](
[ID] [int] IDENTITY(1,1) NOT NULL,
[FirstName] [nvarchar](max) NULL,
[MiddleName] [nvarchar](max) NULL,
[LastName] [nvarchar](max) NULL,
[Email] [nvarchar](max) NULL
)
CREATE TABLE [dbo].[CustomField](
[ID] [int] IDENTITY(1,1) NOT NULL,
[FieldName] [nvarchar](50) NULL,
[Type] [varchar](50) NULL
)
CREATE TABLE [dbo].[ContactAndCustomField](
[ID] [int] IDENTITY(1,1) NOT NULL,
[ContactID] [int] NULL,
[FieldID] [int] NULL,
[FieldValue] [nvarchar](max) NULL
)
However, this approach introduces a lot of complexity, particularly with regard to importing CSV files with multiple custom fields. At the moment this requires an update/join statement and a separate insert statement for every individual custom field. Joins would also be required to return custom field data for multiple rows at once.
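For illustration, reading a few custom fields back for many contacts with the first schema tends to require a pivot-style join like this sketch (the field names 'Birthday' and 'Company' are made up):

-- One row per contact, one MAX(CASE ...) per custom field being read back.
SELECT c.ID,
       MAX(CASE WHEN f.FieldName = 'Birthday' THEN v.FieldValue END) AS Birthday,
       MAX(CASE WHEN f.FieldName = 'Company'  THEN v.FieldValue END) AS Company
FROM dbo.Contact c
LEFT JOIN dbo.ContactAndCustomField v ON v.ContactID = c.ID
LEFT JOIN dbo.CustomField f ON f.ID = v.FieldID
GROUP BY c.ID;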
I've argued for this structure instead:
CREATE TABLE [dbo].[Contact](
[ID] [int] IDENTITY(1,1) NOT NULL,
[FirstName] [nvarchar](max) NULL,
[MiddleName] [nvarchar](max) NULL,
[LastName] [nvarchar](max) NULL,
[Email] [nvarchar](max) NULL,
[CustomField1] [nvarchar](max) NULL,
[CustomField2] [nvarchar](max) NULL,
[CustomField3] [nvarchar](max) NULL /* etc., adding lots of empty fields */
)
CREATE TABLE [dbo].[ContactCustomField](
[ID] [int] IDENTITY(1,1) NOT NULL,
[FieldIndex] [int] NULL,
[FieldName] [nvarchar](50) NULL,
[Type] [varchar](50) NULL
)
The downside of this second approach is that there is a finite number of custom fields, which must be specified when the contacts table is created. I don't think that's a major hurdle, given the performance benefits it will surely have when importing large CSV files and returning result sets.
What approach is the most efficient for large numbers of rows? Are there any downsides to the second technique that I'm not seeing?
Microsoft introduced sparse columns exactly for this type of problem. The point is that in a "classic" design you end up with a large number of columns, most of them NULL for any particular row. It's the same with sparse columns, except that the NULLs don't require any storage. Moreover, you can define a column set and read or modify all the sparse columns through it as XML.
Performance- and storage-wise, sparse columns are the winner.
http://technet.microsoft.com/en-us/library/cc280604.aspx
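A minimal sketch of the sparse-column approach (the column names here are illustrative, not taken from the schemas above):

CREATE TABLE dbo.ContactSparse (
    ID           int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    FirstName    nvarchar(100) NULL,
    Email        nvarchar(256) NULL,
    CustomField1 nvarchar(400) SPARSE NULL,  -- NULL values take no storage
    CustomField2 int SPARSE NULL,
    CustomFields xml COLUMN_SET FOR ALL_SPARSE_COLUMNS  -- optional: exposes all sparse columns as one XML value
);

-- Sparse columns are written like ordinary columns; only non-NULL values are stored.
INSERT INTO dbo.ContactSparse (FirstName, Email, CustomField1)
VALUES (N'Ann', N'ann@example.com', N'Gold tier');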
Query performance for any "property bag table" approach is comically slow. But if you need flexibility, you either have a dynamic table that is changed via an editor, or you have a property bag table. So when you need it, you need it.
But expect the performance to be slow.
The best approach would likely be a ContactCustomFields table whose fields are determined by an editor.
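If the "dynamic table changed via an editor" route is taken, the editor simply issues DDL for each new custom field, along these lines (a sketch; the column name is made up, and SPARSE is optional):

ALTER TABLE dbo.Contact
    ADD CustomField_Birthday nvarchar(400) SPARSE NULL;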
I have created a table that uses a uniqueidentifier (GUID) as its primary key. Now I need to create an index on the table; which one will be best for me? I am going to use this table for error logging.
Following is my table structure
CREATE TABLE [dbo].[errors](
[error_id] [uniqueidentifier] NOT NULL,
[assembly_name] [varchar](50) NULL,
[method_name] [varchar](50) NULL,
[person_id] [int] NULL,
[timestamp] [datetime] NULL,
[description] [varchar](max) NULL,
[parameter_list] [varchar](max) NULL,
[exception_text] [nvarchar](max) NULL)
So which column should I use as the primary key and which index should I create?
Thanks in advance.
You can use the GUID as the PK, but it is not good to use it as the clustered index key. In that case the GUID is copied into all the nonclustered index keys, which makes them much wider and can cause performance issues; it can also cause page splits, which is no good. Wider indexes mean more space is used. If you used a GUID to avoid last-page contention, try some sort of hashing technique to make sure the data goes to different pages, but in that case you have to use the same hashing when selecting from the table by PK.
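One common way to keep the GUID key while avoiding the wide clustered index is to make the GUID primary key nonclustered and cluster on a narrow, ever-increasing column instead. This is only a sketch; the error_seq column and the NEWSEQUENTIALID() default are additions, not part of the original design:

CREATE TABLE dbo.errors (
    error_seq       int IDENTITY(1,1) NOT NULL,
    error_id        uniqueidentifier NOT NULL
                    CONSTRAINT DF_errors_error_id DEFAULT NEWSEQUENTIALID(),
    assembly_name   varchar(50) NULL,
    method_name     varchar(50) NULL,
    person_id       int NULL,
    [timestamp]     datetime NULL,
    [description]   varchar(max) NULL,
    parameter_list  varchar(max) NULL,
    exception_text  nvarchar(max) NULL,
    CONSTRAINT PK_errors PRIMARY KEY NONCLUSTERED (error_id)
);

-- Cluster on the narrow, ever-increasing column instead of the GUID.
CREATE UNIQUE CLUSTERED INDEX CIX_errors_error_seq ON dbo.errors (error_seq);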
Following is the script of the table. Accessing data from this table is too slow.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Emails](
[id] [int] IDENTITY(1,1) NOT NULL,
[datecreated] [datetime] NULL CONSTRAINT [DF_Emails_datecreated]
DEFAULT (getdate()),
[UID] [nvarchar](250) COLLATE Latin1_General_CI_AS NULL,
[From] [nvarchar](100) COLLATE Latin1_General_CI_AS NULL,
[To] [nvarchar](100) COLLATE Latin1_General_CI_AS NULL,
[Subject] [nvarchar](max) COLLATE Latin1_General_CI_AS NULL,
[Body] [nvarchar](max) COLLATE Latin1_General_CI_AS NULL,
[HTML] [nvarchar](max) COLLATE Latin1_General_CI_AS NULL,
[AttachmentCount] [int] NULL,
[Dated] [datetime] NULL
) ON [PRIMARY]
The following query takes 50 seconds to fetch data:
select id, datecreated, UID, [From], [To], Subject, AttachmentCount,
Dated from emails
If I include Body and HTML in the select, the time is even worse.
Indexes are:
id: unique clustered
From: non-unique, non-clustered
To: non-unique, non-clustered
The table currently has 180,000+ records.
There might be 100,000 new records each month, so this will become slower as time passes.
Would splitting the data into two tables solve the problem?
What other indexes should there be?
It's almost certainly the volume of the data that's causing the problem. Because of this, you should not fetch the Subject column until you actually need it. Even fetching SUBSTRING(Subject, 1, 100) may be noticeably faster.
This may be irrelevant, but older versions of SQL Server suffered if the BLOB columns weren't the last in the row, so just as an experiment I'd move [AttachmentCount] and [Dated] above the three nvarchar(max) columns.
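Two concrete variants of that advice, as sketches (the 100-character preview length and the EmailBodies table name are arbitrary choices, not from the original schema):

-- List query that avoids streaming the full nvarchar(max) values.
SELECT id, datecreated, UID, [From], [To],
       LEFT(Subject, 100) AS SubjectPreview,
       AttachmentCount, Dated
FROM dbo.Emails;

-- Optional split, per the question above: keep only the heavy columns in a 1:1 detail table.
CREATE TABLE dbo.EmailBodies (
    id   int NOT NULL PRIMARY KEY,  -- same value as dbo.Emails(id)
    Body nvarchar(max) NULL,
    HTML nvarchar(max) NULL
);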