I have a table with an int identity column, and at times it skips IDs by the thousands. Searches suggest it is normal for SQL Server to skip by 1000 or 1001, but mine jumps by 20,000 or more on occasion, and last time it jumped by 95,216,000.
I am unable to find a reason why this is happening; I checked SQL Server for crash logs and other suspicious events, but no luck.
The table has replication on it. Is that related?
The CREATE TABLE script looks like this:
CREATE TABLE [dbo].[Table](
[CId] [int] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,
.
.
.
CONSTRAINT [PK_Table] PRIMARY KEY CLUSTERED
(
[CId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY]
) ON [PRIMARY]
GO
Did you check the values with this query?
SELECT IDENT_SEED(TABLE_NAME) AS Seed,
IDENT_INCR(TABLE_NAME) AS Increment,
IDENT_CURRENT(TABLE_NAME) AS Current_Identity,
TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE OBJECTPROPERTY(OBJECT_ID(TABLE_NAME), 'TableHasIdentity') = 1
AND TABLE_TYPE = 'BASE TABLE'
-- AND TABLE_NAME = 'YourTableName'  -- optionally restrict to a single table
Also, did you by any chance truncate the table?
Please refer to this link to find the reason for the gaps:
http://sqlity.net/en/792/the-gap-in-the-identity-value-sequence/
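If you want to see how far the current identity value has drifted from what is actually stored, here is a minimal sketch (table name taken from your script; the commented RESEED line is only an illustration and should only be run when nothing else is inserting):
-- Report the current identity value without changing it
DBCC CHECKIDENT ('dbo.[Table]', NORESEED);
-- Compare with the highest value actually stored
SELECT MAX(CId) AS MaxCId FROM dbo.[Table];
-- Explicitly reseed only if you really need to close the gap (use with care)
-- DBCC CHECKIDENT ('dbo.[Table]', RESEED, 12345);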
I have an insert statement that's throwing a primary key error but I don't see how I could possibly be inserting duplicate key values.
First I create a temp table with a primary key.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED -- Note: I've tried COMMITTED and UNCOMMITTED; neither materially affects the behavior. See screenshots below for proof.
IF (OBJECT_ID('TEMPDB..#P')) IS NOT NULL DROP TABLE #P;
CREATE TABLE #P(idIsbn INT NOT NULL PRIMARY KEY, price SMALLMONEY, priceChangedDate DATETIME);
Then I pull prices from the Prices table, grouping by idIsbn, which is the primary key in the temp table.
INSERT INTO #P(idIsbn, price, priceChangedDate)
SELECT idIsbn ,
MIN(lowestPrice) ,
MIN(priceChangedDate)
FROM Price p
WHERE p.idMarketplace = 3100
GROUP BY p.idIsbn
I understand that grouping by idIsbn by definition makes it unique. The idIsbn in the prices table is: [idIsbn] [int] NOT NULL.
But every once in a while when I run this query I get this error:
Violation of PRIMARY KEY constraint 'PK__#P________AED35F8119E85FC5'. Cannot insert duplicate key in object 'dbo.#P'. The duplicate key value is (1447858).
NOTE: I've got a lot of questions about timing. I will select this statement, press F5, and no error will occur. Then I'll do it again, and it will fail, then I'll run it again and again and it will succeed a couple times before it fails again. I guess what I'm saying is that I can find no pattern for when it will succeed and when it won't.
How can I be inserting duplicate rows if (A) I just created the table brand new before inserting into it and (B) I'm grouping by the column designed to be the primary key?
For now, I'm solving the problem with IGNORE_DUP_KEY = ON, but I'd really like to know the root cause of the problem.
Here is what I'm actually seeing in my SSMS window. There is nothing more and nothing less:
@@VERSION is:
Microsoft SQL Server 2008 (SP3) - 10.0.5538.0 (X64)
Apr 3 2015 14:50:02
Copyright (c) 1988-2008 Microsoft Corporation
Standard Edition (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1)
Execution Plan:
Here is an example of what it looks like when it runs fine. Here I'm using READ COMMITTED, but it doesn't matter, because I get the error whether I read committed or uncommitted.
Here is another example of it failing, this time with READ COMMITTED.
Also:
I get the same error whether I'm populating a temp table or a persistent table.
When I add OPTION (MAXDOP 1) to the end of the insert it seems to fail every time, though I can't be exhaustively sure of that because I can't run it forever. But it seems to be the case.
Here is the definition of the Price table. The table has 25M rows and had 108,529 updates in the last hour.
CREATE TABLE [dbo].[Price](
[idPrice] [int] IDENTITY(1,1) NOT NULL,
[idIsbn] [int] NOT NULL,
[idMarketplace] [int] NOT NULL,
[lowestPrice] [smallmoney] NULL,
[offers] [smallint] NULL,
[priceDate] [smalldatetime] NOT NULL,
[priceChangedDate] [smalldatetime] NULL,
CONSTRAINT [pk_Price] PRIMARY KEY CLUSTERED
(
[idPrice] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY],
CONSTRAINT [uc_idIsbn_idMarketplace] UNIQUE NONCLUSTERED
(
[idIsbn] ASC,
[idMarketplace] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
And the two non-clustered indexes:
CREATE NONCLUSTERED INDEX [IX_Price_idMarketplace_INC_idIsbn_lowestPrice_priceDate] ON [dbo].[Price]
(
[idMarketplace] ASC
)
INCLUDE ( [idIsbn],
[lowestPrice],
[priceDate]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IX_Price_idMarketplace_priceChangedDate_INC_idIsbn_lowestPrice] ON [dbo].[Price]
(
[idMarketplace] ASC,
[priceChangedDate] ASC
)
INCLUDE ( [idIsbn],
[lowestPrice]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
You didn't supply your table structure.
This is a repro, with some assumed details, that causes the problem at read committed. (NB: now that you have supplied the definition, I can see that in your case updates to the priceChangedDate column will move rows around in the IX_Price_idMarketplace_priceChangedDate_INC_idIsbn_lowestPrice index, if that is the one being seeked.)
Connection 1 (Set up tables)
USE tempdb;
CREATE TABLE Price
(
SomeKey INT PRIMARY KEY CLUSTERED,
idIsbn INT IDENTITY UNIQUE,
idMarketplace INT DEFAULT 3100,
lowestPrice SMALLMONEY DEFAULT $1.23,
priceChangedDate DATETIME DEFAULT GETDATE()
);
CREATE NONCLUSTERED INDEX ix
ON Price(idMarketplace)
INCLUDE (idIsbn, lowestPrice, priceChangedDate);
INSERT INTO Price
(SomeKey)
SELECT number
FROM master..spt_values
WHERE number BETWEEN 1 AND 2000
AND type = 'P';
Connection 2
Concurrent data modifications that move a row from the beginning of the seeked range (3100, 1) to the end (3100, 2001) and back again, repeatedly.
USE tempdb;
WHILE 1=1
BEGIN
UPDATE Price SET SomeKey = 2001 WHERE SomeKey = 1
UPDATE Price SET SomeKey = 1 WHERE SomeKey = 2001
END
Connection 3 (Do the insert into a temp table with a unique constraint)
USE tempdb;
CREATE TABLE #P
(
idIsbn INT NOT NULL PRIMARY KEY,
price SMALLMONEY,
priceChangedDate DATETIME
);
WHILE 1 = 1
BEGIN
TRUNCATE TABLE #P
INSERT INTO #P
(idIsbn,
price,
priceChangedDate)
SELECT idIsbn,
MIN(lowestPrice),
MIN(priceChangedDate)
FROM Price p
WHERE p.idMarketplace = 3100
GROUP BY p.idIsbn
END
The plan has no aggregate because there is a unique constraint on idIsbn (a unique constraint on idIsbn, idMarketplace would also work), so the GROUP BY can be optimised out: there can be no duplicate values.
But at the READ COMMITTED isolation level, shared row locks are released as soon as the row is read, so it is possible for a row to move position and be read a second time by the same seek or scan.
The index ix doesn't explicitly include SomeKey as a secondary key column, but because the index is not declared unique, SQL Server silently includes the clustering key behind the scenes; hence updating that column's value can move rows around in it.
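One possible mitigation (my own suggestion, not something prescribed above) is to run the read under SNAPSHOT isolation so the statement sees a transactionally consistent, versioned view of Price and a moving row cannot be encountered twice by the same scan. A sketch, assuming you can enable the option on the database (YourDb is a placeholder):
ALTER DATABASE YourDb SET ALLOW_SNAPSHOT_ISOLATION ON;

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

INSERT INTO #P (idIsbn, price, priceChangedDate)
SELECT idIsbn, MIN(lowestPrice), MIN(priceChangedDate)
FROM Price p
WHERE p.idMarketplace = 3100
GROUP BY p.idIsbn;
Row versioning adds tempdb overhead, so measure it before adopting it on a table with this update rate.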
I've tried searching for an answer to this but can't find one.
I have a CTE I use for SQL queries relating to two tables in a database. The primary key of one table is a foreign key in the other and can appear numerous times in the second table. I want to count the number of times each foreign key appears in the second table and list this as a total field in my search results, along with details from the first table. As CTEs don't work in Access, I've adjusted the query to use a sub-select in the join, but Access still doesn't like it.
Here are the basic parts of the tables
CREATE TABLE [dbo].[Clients](
[ClientRef] [int] NOT NULL,
[Surname] [varchar](40) NULL,
[Forenames] [varchar](50) NULL,
[Title] [varchar](40) NULL,
CONSTRAINT [CLIE_ClientRef_PK] PRIMARY KEY CLUSTERED
(
[ClientRef] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
CREATE TABLE [dbo].[Policies](
[PolicyRef] [int] NOT NULL,
[ClientRef] [int] NULL,
CONSTRAINT [POLI_PolicyRef_PK] PRIMARY KEY CLUSTERED
(
[PolicyRef] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Here's my CTE
WITH CliPol (ClientRef, Plans) AS (SELECT ClientRef, COUNT(ClientRef) AS Plans FROM Policies GROUP BY ClientRef)
SELECT Clients.Surname, Clients.Forenames, Clients.Title, CliPol.Plans AS [No. of plans]
FROM Clients LEFT JOIN CliPol ON Clients.ClientRef = CliPol.ClientRef
ORDER BY Surname, Forenames;
And here's my adjusted query.
SELECT Clients.ClientRef, Clients.Surname, Clients.Forenames, Clients.Title , Plans.NoPlans
FROM Clients
LEFT JOIN
(SELECT ClientRef, COUNT(ClientRef) AS NoPlans FROM Policies GROUP BY ClientRef)
AS Plans ON Plans.ClientRef = Clients.ClientRef
ORDER BY Clients.Surname, Clients.Forenames
Unfortunately Access throws error #3131, "Syntax error in FROM clause", when I try to run that query.
Does anybody know how I make this work in Access?
One alternative approach would be to use the DCount() domain aggregate function:
SELECT
Clients.ClientRef,
Clients.Surname,
Clients.Forenames,
Clients.Title ,
DCount("ClientRef", "Plans", "ClientRef=" & Clients.ClientRef) AS NoPlans
FROM Clients
ORDER BY Clients.Surname, Clients.Forenames
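Another route that usually works in Access is to save the aggregate as its own query and then join to it as if it were a table. A sketch, assuming you can create a saved query (qryPlanCounts is a name I'm inventing here):
Saved query qryPlanCounts:
SELECT ClientRef, COUNT(ClientRef) AS NoPlans
FROM Policies
GROUP BY ClientRef;
Main query:
SELECT Clients.ClientRef, Clients.Surname, Clients.Forenames, Clients.Title, qryPlanCounts.NoPlans
FROM Clients LEFT JOIN qryPlanCounts ON Clients.ClientRef = qryPlanCounts.ClientRef
ORDER BY Clients.Surname, Clients.Forenames;
On large tables this tends to be faster than DCount(), which runs once per output row.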
As part of a migration project, we have imported data from a JDE iSeries DB2 database. An SSIS package was created to build the destination tables and import the data. The import was successful.
Now comes the problem - The customer wants Primary Keys created in the destination DB (SQL 2008 R2). The problem table in this case, would be one table that has 104 columns and 7.5 million rows of data. The PK required for this table is composite and has 7 columns.
We are considering this:
BEGIN TRANSACTION
GO
ALTER TABLE [dbo].[F0911] ADD CONSTRAINT [F0911_PK] PRIMARY KEY CLUSTERED
(
[GLDCT] ASC,
[GLDOC] ASC,
[GLKCO] ASC,
[GLDGJ] ASC,
[GLJELN] ASC,
[GLLT] ASC,
[GLEXTL] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
COMMIT
or this:
-- Rename existing table
EXEC sp_rename 'dbo.F0911', 'F0911_old';
GO
-- Create new table
SELECT * INTO F0911 FROM F0911_old WHERE 1=0
GO
--Create PK constraints
ALTER TABLE [dbo].[F0911] ADD CONSTRAINT [F0911_PK] PRIMARY KEY CLUSTERED
(
[GLDCT] ASC,
[GLDOC] ASC,
[GLKCO] ASC,
[GLDGJ] ASC,
[GLJELN] ASC,
[GLLT] ASC,
[GLEXTL] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
--Insert data into new tables
INSERT INTO F0911
SELECT * FROM F0911_old
GO
-- Drop old tables
DROP TABLE F0911_old
GO
Which would be the more efficient approach, performance-wise? I have a gut feeling that both are the same, and that the first approach implicitly does the same thing as the second. Is this understanding correct?
Please note that all these columns already exist in the table and we cannot modify the table definition.
Thanks,
Raj
They're the same. The effect of creating a clustered index is to arrange the pages, which will happen in both cases. For non-clustered indexes, it helps to disable the index first and then bring it back by rebuilding it after the data is loaded.
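For example, a minimal sketch of that disable-and-rebuild pattern around the bulk copy (IX_F0911_SomeColumn is a hypothetical non-clustered index name):
ALTER INDEX IX_F0911_SomeColumn ON dbo.F0911 DISABLE;
-- INSERT INTO F0911 SELECT * FROM F0911_old
ALTER INDEX IX_F0911_SomeColumn ON dbo.F0911 REBUILD;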
I think the first approach is right, but I don't understand the reason for the BEGIN TRANSACTION and COMMIT. I don't think the transaction is necessary here because you are not modifying the data in the table; a transaction is used when you have to lock data that is being modified in real time so that the old data is not used.
Should we use a flag for soft deletes, or a separate joiner table? Which is more efficient? Database is SQL Server.
Background Information
A while back we had a DB consultant come in and look at our database schema. When we soft delete a record, we update an IsDeleted flag on the appropriate table(s). It was suggested that instead of using a flag, we store the deleted records in a separate table and use a join, as that would be better. I've put that suggestion to the test, but at least on the surface, the extra table and join look to be more expensive than using a flag.
Initial Testing
I've set up this test.
Two tables, Example and DeletedExample. I added a nonclustered index on the IsDeleted column.
I did three tests, loading a million records with the following deleted/non-deleted ratios:
Deleted/NonDeleted
50/50
10/90
1/99
Results - 50/50
Results - 10/90
Results - 1/99
Database scripts, for reference: Example, DeletedExample, and the index on Example.IsDeleted.
CREATE TABLE [dbo].[Example](
[ID] [int] NOT NULL,
[Column1] [nvarchar](50) NULL,
[IsDeleted] [bit] NOT NULL,
CONSTRAINT [PK_Example] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[Example] ADD CONSTRAINT [DF_Example_IsDeleted] DEFAULT ((0)) FOR [IsDeleted]
GO
CREATE TABLE [dbo].[DeletedExample](
[ID] [int] NOT NULL,
CONSTRAINT [PK_DeletedExample] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[DeletedExample] WITH CHECK ADD CONSTRAINT [FK_DeletedExample_Example] FOREIGN KEY([ID])
REFERENCES [dbo].[Example] ([ID])
GO
ALTER TABLE [dbo].[DeletedExample] CHECK CONSTRAINT [FK_DeletedExample_Example]
GO
CREATE NONCLUSTERED INDEX [IX_IsDeleted] ON [dbo].[Example]
(
[IsDeleted] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
The numbers you have seem to indicate that my initial impression was correct: if your most common query against this database is to filter on IsDeleted = 0, then performance will be better with a simple bit flag, especially if you make wise use of indexes.
If you often query for deleted and undeleted data separately, then you could see a performance gain by having a table for deleted items and another for undeleted items, with identical fields. But denormalizing your data like this is rarely a good idea, as it will most often cost you far more in code maintenance costs than it will gain you in performance increases.
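As one concrete example of "wise use of indexes" with the flag approach (a sketch only; the index name is mine and SQL Server 2008 or later is assumed), a filtered index that covers just the live rows keeps the common IsDeleted = 0 queries small:
CREATE NONCLUSTERED INDEX IX_Example_NotDeleted
ON dbo.Example (ID)
INCLUDE (Column1)
WHERE IsDeleted = 0;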
I'm not a SQL expert, but in my opinion it all depends on how heavily the database is used. If the database is accessed by a large number of users and needs to be efficient, then a separate table of deleted records is the better choice. A good option would be to use a flag during production, and then, as part of daily/weekly/monthly maintenance, move all the soft-deleted records into the deleted-records table and clear them out of the production table. A mixture of both options works well.
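A rough sketch of that maintenance step, assuming the flag-based schema from the question (no FK joiner table) and a hypothetical archive table DeletedExampleArchive with the same columns:
BEGIN TRANSACTION;

-- Copy soft-deleted rows to the archive table
INSERT INTO dbo.DeletedExampleArchive (ID, Column1)
SELECT ID, Column1
FROM dbo.Example
WHERE IsDeleted = 1;

-- Then remove them from the production table
DELETE FROM dbo.Example
WHERE IsDeleted = 1;

COMMIT TRANSACTION;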
Every day I import 2,000,000 rows from text files into SQL Server 2008 using BULK INSERT, and then I do some post-processing to update the records.
I have some indexes on the table to make the post-processing as fast as possible, and normally the post-processing script takes about 40 seconds to run.
But sometimes (I don't know when) the post-processing misbehaves: in the situation I mentioned, it still isn't finished after an hour! After rebuilding the indexes, everything is fine and normal again.
What should I do to prevent this problem from happening?
Right now I have a nightly job to rebuild all indexes. Why does the index fragmentation grow to 90%?
Update:
Here is the table I import the text files into:
CREATE TABLE [dbo].[My_Transactions](
[My_TransactionId] [bigint] NOT NULL,
[FileId] [int] NOT NULL,
[RowNo] [int] NOT NULL,
[TransactionTypeId] [smallint] NOT NULL,
[TransactionDate] [datetime] NOT NULL,
[TransactionNumber] [dbo].[TransactionNumber] NOT NULL,
[CardNumber] [dbo].[CardNumber] NULL,
[AccountNumber] [dbo].[CardNumber] NULL,
[BankCardTypeId] [smallint] NOT NULL,
[AcqBankId] [smallint] NOT NULL,
[DeviceNumber] [dbo].[DeviceNumber] NOT NULL,
[Amount] [dbo].[Amount] NOT NULL,
[DeviceTypeId] [smallint] NOT NULL,
[TransactionFee] [dbo].[Amount] NOT NULL,
[AcqSwitchId] [tinyint] NOT NULL
) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [_dta_index_Jam_Transactions_8_1290487676__K1_K4_K12_K6_K11_5] ON [dbo].[Jam_Transactions]
(
[Jam_TransactionId] ASC,
[TransactionTypeId] ASC,
[Amount] ASC,
[TransactionNumber] ASC,
[DeviceNumber] ASC
)
INCLUDE ( [TransactionDate]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [_dta_index_Jam_Transactions_8_1290487676__K12_K6_K11_K1_5] ON [dbo].[Jam_Transactions]
(
[Amount] ASC,
[TransactionNumber] ASC,
[DeviceNumber] ASC,
[Jam_TransactionId] ASC
)
INCLUDE ( [TransactionDate]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IX_Jam_Transactions] ON [dbo].[Jam_Transactions]
(
[Jam_TransactionId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
Have you tried just refreshing the statistics after such a large insert:
UPDATE STATISTICS my_table
My experience with large bulk inserts is that the statistics get all mangled up and need a refresh afterwards; it's also much faster than running an index rebuild or reorganize.
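On a table this size a full statistics scan after the load may be worth the extra time, for example (My_Transactions is the table from the question):
UPDATE STATISTICS dbo.My_Transactions WITH FULLSCAN;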
Another option is to look into padding the index. You likely have no fill factor specified on your indexes, meaning that if your index is:
A, B, D, E, F
and you insert a value with a CardNumber of C, then your index will look like:
A, B, D, E, F, C
and hence be ~20% fragmented. If you instead specify a fill factor that leaves, say, 15% free space, the index would look roughly like:
A, B, D, _, E, F
(Note that the empty space is left inside the index, spread across the pages, not tacked on at the end.)
So when you insert the C value it lands close to where it belongs; in practice C goes into the free slot and D just shifts over, instead of the index fragmenting at the end.
Beyond that, are you sure the fragmentation is actually the problem? As part of reindexing, the table is read and loaded entirely into memory (provided it fits), so any query you run on it afterwards will be very fast.
Instead of including this table in the nightly job, why don't you make index maintenance (on this table specifically) part of the nightly import job, between BULK INSERT and whatever 'post-processing' is?
We don't have enough information to know why the index fragmentation grows that quickly. Which index(es)? How many indexes are there? What is the order of the data in the file?
You may also consider using the ORDER option of the BULK INSERT statement to change the way the data is inserted. It may make the load take longer, but it should reduce the need to reorganize; again, this depends on the order of the source data and on which index(es) become fragmented.
Finally, what is the impact of rebuilding/not rebuilding or reorganizing/not reorganizing the indexes? Have you tried both? Perhaps it makes the post-processing run quicker if you rebuild, but perhaps only a defragment is necessary. And while it may make the post-processing quicker, what about the queries that are run against the table later in the day? Have you done any metrics against those to see if they speed up or slow down depending on what you do at night?
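As a sketch of the first suggestion above, the nightly job could interleave the maintenance between the load and the post-processing (the file path is a placeholder, and whether you REORGANIZE, REBUILD, or only update statistics should follow from the measurements just discussed):
BULK INSERT dbo.My_Transactions
FROM 'C:\import\transactions.txt'   -- placeholder path and options
WITH (TABLOCK);

ALTER INDEX ALL ON dbo.My_Transactions REORGANIZE;   -- or REBUILD if heavily fragmented
UPDATE STATISTICS dbo.My_Transactions;

-- post-processing runs here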
Does your main table keep growing by 2 million rows per day or is there a lot of deleting taking place as well? Could you bulk insert into a temporary import table and do your processing prior to inserting into the main table? You can always use hints to force your queries to use certain indices:
SELECT *
FROM your_table_name WITH (INDEX(your_index_name))
WHERE your_column_name = 5
I would try taking the index offline before a mass row insertion and then bringing it back online after the insertion. It is much, much faster than re-indexing or performing a drop and create of the index. The difference is that the index definition stays in place, but the index is not used or maintained ("offline") until it is rebuilt and brought back "online". I have a 1.5 million row insertion process and had a problem with one of my non-clustered indexes fragmenting, which was causing poor performance; I went from 99% fragmentation to 0.14% using this disable/rebuild approach.
Code sample:
-- Disable the non-clustered index before the mass insert...
ALTER INDEX idx_a ON dbo.tbl_A DISABLE;
-- ...do the mass row insertion here...
-- ...then rebuild it, which brings it back into use:
ALTER INDEX idx_a ON dbo.tbl_A REBUILD WITH (ONLINE = OFF);
Disable before the load, rebuild after it, and you are good to go.