There is a SQL table which is growing rapidly and out of proportion to the data it actually holds. In short, a Windows service backs up the content of .txt files into this table. The files weigh from 1 KB to approximately 45 KB, hence the nvarchar(max) column used to store their content.
When running the sp_spaceused command on this table, here is the result:
name   rows   reserved   data       index_size  unused
Files  20402  814872 KB  813416 KB  1048 KB     408 KB
But when I run this simple query, which gives me the total amount of data in bytes used by this table, the result (97231108 bytes, roughly 93 MB) is nowhere near the roughly 794 MB that sp_spaceused reports as data.
SELECT (SUM(DATALENGTH(A)) +
SUM(DATALENGTH(B)) +
SUM(DATALENGTH(C)) +
SUM(DATALENGTH(D)) +
SUM(DATALENGTH(E)) +
SUM(DATALENGTH(F)) +
SUM(DATALENGTH(G)) +
SUM(DATALENGTH(H)) +
SUM(DATALENGTH(I))) AS BytesUsed
FROM Files
RESULT: 97231108 bytes
The create statement for this table goes like this:
CREATE TABLE [dbo].[Files](
[A] [int] IDENTITY(33515427,1) NOT NULL,
[B] [nvarchar](100) NOT NULL,
[C] [nvarchar](max) NOT NULL,
[D] [nvarchar](100) NOT NULL,
[E] [datetime] NULL,
[F] [nvarchar](2) NULL,
[G] [datetime] NULL,
[H] [nvarchar](100) NULL,
[I] [int] NULL,
CONSTRAINT [PK_Files] PRIMARY KEY CLUSTERED
(
[A] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY],
CONSTRAINT [UK_Files_FileType_FileDate] UNIQUE NONCLUSTERED
(
[D] ASC,
[E] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
ALTER TABLE [dbo].[Files] WITH CHECK ADD CONSTRAINT [FK_Files_FileStatus] FOREIGN
KEY([F])
REFERENCES [dbo].[F] ([F])
GO
ALTER TABLE [dbo].[Files] CHECK CONSTRAINT [FK_Files_FileStatus]
GO
Temporary fix: I recreated the table (DROP and CREATE) and copied the old table's data into the new one. This took the table from 65 GB to 108 MB.
My question is:
What can make this table take up so much space, and how can I prevent it from growing again?
Installing the latest service pack fixed the problem.
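For anyone diagnosing a similar gap, one way to see where the space sits is to break it down by allocation type. This is only a minimal sketch, assuming the table is dbo.Files as in the CREATE statement above. sys.dm_db_partition_stats reports page counts, and each page is 8 KB:

```sql
-- Break the space used by dbo.Files down by storage type, per index.
-- Page counts are multiplied by 8 because a SQL Server page is 8 KB.
SELECT ps.index_id,
       ps.row_count,
       ps.in_row_used_page_count       * 8 AS InRowUsedKB,
       ps.lob_used_page_count          * 8 AS LobUsedKB,
       ps.row_overflow_used_page_count * 8 AS RowOverflowUsedKB,
       ps.reserved_page_count          * 8 AS ReservedKB
FROM sys.dm_db_partition_stats AS ps
WHERE ps.object_id = OBJECT_ID(N'dbo.Files');
```

If the LOB figure is far larger than the DATALENGTH total, the excess is most likely in the nvarchar(max) storage rather than in the rows themselves.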
Related
Would someone please help me get rid of this error:
Violation of PRIMARY KEY constraint 'PK_stmp_tst1'. Cannot insert duplicate key in object 'dbo.stmp_tst'. The duplicate key value is (1).
It occurs after switching a partition of the table.
Full SQL Script below:
I. Create two identical tables in different schemas:
CREATE TABLE dbo.stmp_tst(
[inn] [varchar](20) NULL,
[id] [bigint] IDENTITY(1,1) NOT NULL,
CONSTRAINT [PK_stmp_tst] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
go
CREATE TABLE stage.stmp_tst(
[inn] [varchar](20) NULL,
[id] [bigint] IDENTITY(1,1) NOT NULL,
CONSTRAINT [PK_stmp_tst] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
II. Insert data into stage table.
insert into stage.stmp_tst (inn)
select '1111'
III. Switch-partition stage to dbo.
alter table stage.stmp_tst switch partition 1 to dbo.stmp_tst partition 1;
IV. Add data to dbo table.
insert into dbo.stmp_tst (inn)
select '1111'
V. We have the error:
Violation of PRIMARY KEY constraint 'PK_stmp_tst1'. Cannot insert
duplicate key in object 'dbo.stmp_tst'. The duplicate key value is
(1).
VI. Drop the temporary tables:
drop table dbo.stmp_tst
drop table stage.stmp_tst
It can be solved with this query:
DBCC CHECKIDENT ('dbo.stmp_tst', RESEED);
but reseeding takes time.
Is it possible to switch a partition correctly without reseeding?
Thank you.
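The error above suggests that the switch moved the row with id 1 into dbo.stmp_tst while that table's identity counter stayed at its seed. A minimal sketch for confirming this before deciding whether to reseed, using the table name from the script above. NORESEED only reports values and changes nothing:

```sql
-- Report the current identity value and the current maximum of the
-- identity column, without modifying either.
DBCC CHECKIDENT ('dbo.stmp_tst', NORESEED);

-- The same comparison as a plain query.
SELECT IDENT_CURRENT('dbo.stmp_tst')       AS CurrentIdentity,
       (SELECT MAX(id) FROM dbo.stmp_tst)  AS MaxId;
```

If CurrentIdentity is lower than MaxId after the switch, the next insert will collide with an existing key, which matches the error shown.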
I have a table Log:
CREATE TABLE [dbo].[Log]
(
[Id] [INT] IDENTITY(1,1) NOT NULL,
[Date] [DATETIME] NOT NULL,
[Thread] [VARCHAR](255) NOT NULL,
[Level] [VARCHAR](50) NOT NULL,
[Logger] [VARCHAR](255) NOT NULL,
[Message] [VARCHAR](4000) NOT NULL,
[Exception] [VARCHAR](2000) NULL,
CONSTRAINT [PK_Log]
PRIMARY KEY NONCLUSTERED ([Id] ASC)
)
The PK is Id. We partitioned the table on column [Date] after creating a clustered index on it and changing the PK to non-clustered:
ALTER TABLE [dbo].[Log]
ADD CONSTRAINT [PK_Log]
PRIMARY KEY NONCLUSTERED ([Id] ASC) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
CREATE CLUSTERED INDEX [IX_Log_Date]
ON [dbo].[Log]([Date] ASC) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
The partitions are created successfully.
Now, we want to use Truncate to remove partitions:
TRUNCATE TABLE [dbo].[Log]
WITH (PARTITIONS (1 TO 2));
But we get this error:
TRUNCATE TABLE statement failed. Index 'PK_Log' is not partitioned, but table 'Log' uses partition function 'myDateRangePF'. Index and table must use an equivalent partition function.
Does this mean a partitioned table can only have one index? If the existing table has multiple indexes, do we have to remove all of them before we can truncate it?
Thanks
The issue is that you created the index PK_Log with ON [PRIMARY], which made it a non-partitioned index on a partitioned table. You'll need to drop that index (and probably any other non-partitioned indexes) and recreate it. Either specify the partition scheme explicitly, or leave the ON clause out and let SQL Server pick the location. By default, it will create the index on the same filegroup or partition scheme as the underlying table, with the same partitioning as the table.
See Partitioned Indexes in BOL for additional information.
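A rough sketch of what that could look like for the Log table from the question. The partition scheme name myDateRangePS is hypothetical (the question only names the function myDateRangePF), so substitute the real one. One caveat worth checking: as far as I know, a unique index can only be partition-aligned if the partitioning column is part of its key, so the primary key below includes [Date], which changes what it enforces (uniqueness of the pair, not of Id alone).

```sql
-- List each index on dbo.Log and where it lives; a partitioned index
-- shows PARTITION_SCHEME, a non-partitioned one shows ROWS_FILEGROUP.
SELECT i.name AS IndexName,
       ds.name AS DataSpaceName,
       ds.type_desc
FROM sys.indexes AS i
JOIN sys.data_spaces AS ds
  ON ds.data_space_id = i.data_space_id
WHERE i.object_id = OBJECT_ID(N'dbo.Log');

-- Recreate the primary key on the partition scheme.
-- Assumption: myDateRangePS is a placeholder for the actual scheme name.
ALTER TABLE [dbo].[Log] DROP CONSTRAINT [PK_Log];

ALTER TABLE [dbo].[Log]
ADD CONSTRAINT [PK_Log]
PRIMARY KEY NONCLUSTERED ([Id] ASC, [Date] ASC)
ON myDateRangePS([Date]);
```

Once every index reports PARTITION_SCHEME, the TRUNCATE TABLE ... WITH (PARTITIONS ...) statement should no longer complain about PK_Log.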
I am suffering from horrendous performance issues using an Azure SQL DB.
It's one table in particular, with the following schema:
CREATE TABLE [dbo].[RawTwitter](
[Id] [int] IDENTITY(1,1) NOT NULL,
[IsError] [bit] NULL,
[ErrorDescription] [nvarchar](max) NULL,
[IsProcessed] [bit] NULL,
[IsRunResult] [bit] NULL,
[RawJson] [nvarchar](max) NULL,
The RawJson field is the culprit. If I do a SELECT that contains that field, my query takes minutes, as long as 20 minutes! If I take that column out of the query, it's instant. There are only about 45,000 records in that table!
I'm not getting this issue on local so I scripted that table and found that it differed from that on Azure. On local, the following is appended to the table script:
CONSTRAINT [PK_RawTwitter] PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
but on the Azure version, TEXTIMAGE_ON [PRIMARY] is omitted. After some more digging, it would appear that Azure does not allow that last clause, and that this affects how large text fields are stored.
This would seem to be the obvious reason there is such a big difference in performance between my local and staging. What other things can I try to get around this performance nightmare?
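One thing that may be worth ruling out first is the sheer volume of data the SELECT has to return. A minimal sketch, assuming the table and column names from the schema above, that measures how many bytes RawJson holds without pulling the values themselves over the wire:

```sql
-- Measure the size of the RawJson column without returning its contents.
-- DATALENGTH returns bytes; nvarchar stores two bytes per character.
SELECT COUNT(*)                                  AS RowCnt,
       SUM(CAST(DATALENGTH(RawJson) AS bigint))  AS TotalBytes,
       MAX(DATALENGTH(RawJson))                  AS MaxBytes
FROM dbo.RawTwitter;
```

If TotalBytes turns out to be in the hundreds of megabytes, the time may be going into transferring the result from Azure rather than into how the column is stored.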
I have a problem with my indexes on two tables.
Here is the code for creating the tables:
CREATE TABLE [dbo].[Table]
(
[ID] [uniqueidentifier] NOT NULL,
[IP] [nvarchar](15) NULL,
[Referrer] [nvarchar](1000) NULL,
[Domain] [nvarchar](100) NULL,
[RegID] [int] NULL,
[Agent] [nvarchar](500) NULL,
CONSTRAINT [PK_Table] PRIMARY KEY CLUSTERED
([ID] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[Table]
ADD CONSTRAINT [DF_Table_ID] DEFAULT (newsequentialid()) FOR [ID]
GO
And the index:
CREATE NONCLUSTERED INDEX [Reg_ID] ON [dbo].[Table]
(
[RegID] ASC
)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
And another table with its index:
CREATE TABLE [dbo].[Table2]
(
[Table2_ID] [int] IDENTITY(1,1) NOT NULL,
[TracID] [uniqueidentifier] NOT NULL,
[F_URL] [nvarchar](1500) NULL,
[S_URL] [nvarchar](100) NULL,
[Time] [datetime] NULL,
CONSTRAINT [PK_Table2] PRIMARY KEY CLUSTERED ([Table2_ID] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[Table2] WITH CHECK
ADD CONSTRAINT [FK_Table2_Table]
FOREIGN KEY([TracID]) REFERENCES [dbo].[Table] ([ID])
GO
ALTER TABLE [dbo].[Table2] CHECK CONSTRAINT [FK_Table2_Table]
GO
Index
CREATE NONCLUSTERED INDEX [IX_TracID] ON [dbo].[Table2]
(
[TracID] ASC
)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
The first table has about 6M rows and the second about 8M rows (a couple of thousand are added each day).
My problem is that the indexes become up to 99% fragmented within 4 hours.
I ran a query against sys.columns to get the column sizes in bytes, and here are the results:
Table 1              Table 2
name      bytes      name    bytes
ID        16         ID      4
IP        30         TracID  16
Referrer  2000       F_URL   3000
Domain    200        S_URL   200
RegID     4          Time    8
Agent     1000
Does anyone have an idea which could help me fix that fragmentation?
Are you SURE you need to defragment? With proper hardware, fragmentation rarely matters anymore. Many old-school SQL people still recommend it, but in most cases it is a relic of the past.
There are two reasons it has become largely irrelevant. First, all reads should be cached in RAM (if not, you need more RAM; it's cheap and will give you WAY more bang for the buck than effort spent defragmenting). Second, SSDs eliminate seek times anyway, so the physical order of pages stops mattering. As a result of these two changes, the time spent defragmenting is usually wasted.
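Either way, it helps to measure before deciding. A minimal sketch, assuming the table name dbo.Table2 from the question, that reports fragmentation and page count per index using sys.dm_db_index_physical_stats in its cheapest mode:

```sql
-- Report fragmentation and size for every index on dbo.Table2.
-- 'LIMITED' scans only the upper index levels, so it is the lightest mode.
SELECT i.name AS IndexName,
       ps.index_type_desc,
       ps.avg_fragmentation_in_percent,
       ps.page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.Table2'), NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id  = ps.index_id
ORDER BY ps.avg_fragmentation_in_percent DESC;
```

A high percentage on an index with a small page_count is usually not worth acting on, so both columns need to be read together.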
I have implemented FILESTREAM in an existing database on SQL Server 2008 R2.
Now I have a very urgent problem, as my site is practically down:
With a very simple table like this:
CREATE TABLE [dbo].[Table1](
[Id] [int] IDENTITY(1,1),
[rowguid] [uniqueidentifier] ROWGUIDCOL NOT NULL,
[Image] [varbinary](max) FILESTREAM NULL,
CONSTRAINT [PK_Table1] PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY],
CONSTRAINT [Table1RowguidUnique] UNIQUE NONCLUSTERED
(
[rowguid] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
ALTER TABLE [dbo].[Table1] ADD CONSTRAINT [DF_table1_rowguid] DEFAULT (newid()) FOR [rowguid]
GO
ALTER TABLE dbo.Table1
SET ( FILESTREAM_ON = fsfg_LiveWebsite )
GO
If I run:
select * from Table1 where Id = 1
it runs very quickly and gives the correct result.
If I run anything with the varbinary(max) FILESTREAM column in the WHERE clause, the whole table locks up.
For example, either of these two queries:
select Id from Table1 where Id = 1 and [Image] is null
select Id from Table1 where Id = 1 and [Image] = convert(varbinary(max), 'a')
What could this be?
Please reply asap with any suggestion!
Thank you
First and foremost, if you want to query the contents of a VARBINARY column, you'll need to enable and use Full-Text Search.
Reference articles:
http://technet.microsoft.com/en-us/library/ms142531.aspx (this article will give you an overview of Full-Text Search and the VARBINARY column)
http://technet.microsoft.com/en-us/library/ms187787.aspx (this article will show you how to use the CONTAINS T-SQL command to query the field)