Re-seeding a large sql table - sql

Using version:
Microsoft SQL Server 2008 R2 (SP3-OD) (KB3144114) - 10.50.6542.0 (Intel X86)
Feb 22 2016 18:12:09
Copyright (c) Microsoft Corporation
Standard Edition on Windows NT 5.2 <X86> (Build : )
I have a heavy table (135K rows) that I moved from another DB.
It transferred with the [id] column as a standard int column instead of being the identity (key & seed) column.
When trying to edit that field to become an identity specification with a seed value, it errors out and gives me this error:
Execution Timeout Expired.
The timeout period elapsed prior to completion of the operation...
I even tried deleting that column, to try to recreate it later, but I get the same issue.
Thanks
UPDATE:
Table structure:
CREATE TABLE [dbo].[tblEmailsSent](
[id] [int] IDENTITY(1,1) NOT NULL, -- this is what it should be. Currently it's just an `[int] NOT NULL`
[Sent] [datetime] NULL,
[SentByUser] [nvarchar](50) NULL,
[ToEmail] [nvarchar](150) NULL,
[StudentID] [int] NULL,
[SubjectLine] [nvarchar](200) NULL,
[MessageContent] [nvarchar](max) NULL,
[ReadStatus] [bit] NULL,
[Folder] [nvarchar](50) NULL,
CONSTRAINT [PK_tblMessages] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO

I think that your question is a duplicate of Adding an identity to an existing column. That question has an answer that should be perfect for your situation; I'll reproduce its essential part below.
But before that, let's clarify why you see the timeout error.
You are trying to add the IDENTITY property to an existing column, and you are using the SSMS GUI for it. A simple ALTER COLUMN statement can't do it, and even if it could, SSMS generates a script that creates a new table, copies the data over into the new table, drops the old table and renames the new table to the old name. When you do this operation via the SSMS GUI, it runs its scripts with a predefined timeout of 30 seconds.
Of course, you can change this setting in SSMS and increase the timeout, but there is a much better way.
Simple/lazy way
Use SSMS GUI to change the column definition, but then instead of clicking "Save", click "Generate Change Script" in the table designer.
Then save this script to a file and review the generated T-SQL code that the GUI runs behind the scenes.
You'll see that it creates a temp table with the required schema, copies the data over, re-creates foreign keys and indexes, drops the old table and renames the new table.
The script itself is usually correct, but pay close attention to the transactions in it. For some reason SSMS often doesn't use a single transaction for the whole operation, but several. I'd recommend manually reviewing the script and making sure there is only one BEGIN TRANSACTION at the top and one COMMIT at the end. You don't want to end up with a half-done operation and, say, a table where all indexes and foreign keys have been dropped.
If this is a one-off operation, it could be enough for you. Your table is only 2.4GB, so it may take a few minutes, but it should not be hours.
If you run the T-SQL script yourself in SSMS, then by default there is no timeout. You can stop it yourself if it takes too long.
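To give a feel for the shape of script you should end up with after that review, here is a minimal sketch reduced to a single transaction (the temp table and constraint names are placeholders; the script SSMS actually generates will differ in detail):
BEGIN TRANSACTION;
-- new table with the same schema, but with the IDENTITY property on [id]
CREATE TABLE [dbo].[Tmp_tblEmailsSent](
[id] [int] IDENTITY(1,1) NOT NULL,
[Sent] [datetime] NULL,
[SentByUser] [nvarchar](50) NULL,
[ToEmail] [nvarchar](150) NULL,
[StudentID] [int] NULL,
[SubjectLine] [nvarchar](200) NULL,
[MessageContent] [nvarchar](max) NULL,
[ReadStatus] [bit] NULL,
[Folder] [nvarchar](50) NULL,
CONSTRAINT [PK_Tmp_tblEmailsSent] PRIMARY KEY CLUSTERED ([id] ASC)
);
-- copy the data, preserving the existing id values
SET IDENTITY_INSERT [dbo].[Tmp_tblEmailsSent] ON;
INSERT INTO [dbo].[Tmp_tblEmailsSent]
([id], [Sent], [SentByUser], [ToEmail], [StudentID], [SubjectLine], [MessageContent], [ReadStatus], [Folder])
SELECT [id], [Sent], [SentByUser], [ToEmail], [StudentID], [SubjectLine], [MessageContent], [ReadStatus], [Folder]
FROM [dbo].[tblEmailsSent] WITH (HOLDLOCK TABLOCKX);
SET IDENTITY_INSERT [dbo].[Tmp_tblEmailsSent] OFF;
-- drop the old table and rename the new one
DROP TABLE [dbo].[tblEmailsSent];
EXEC sp_rename N'dbo.Tmp_tblEmailsSent', N'tblEmailsSent', 'OBJECT';
COMMIT;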
The smart and fast way to do it is described in detail in this answer by Justin Grant.
The main idea is to use the ALTER TABLE...SWITCH statement, so that the change only touches metadata, without touching each page of the table.
BEGIN TRANSACTION;
-- create a new table with required schema
CREATE TABLE [dbo].[NEW_tblEmailsSent](
[id] [int] IDENTITY(1,1) NOT NULL,
[Sent] [datetime] NULL,
[SentByUser] [nvarchar](50) NULL,
[ToEmail] [nvarchar](150) NULL,
[StudentID] [int] NULL,
[SubjectLine] [nvarchar](200) NULL,
[MessageContent] [nvarchar](max) NULL,
[ReadStatus] [bit] NULL,
[Folder] [nvarchar](50) NULL,
CONSTRAINT [PK_tblEmailsSent] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
-- switch the tables
ALTER TABLE [dbo].[tblEmailsSent] SWITCH TO [dbo].[NEW_tblEmailsSent];
-- drop the original (now empty) table
DROP TABLE [dbo].[tblEmailsSent];
-- rename new table to old table's name
EXEC sp_rename 'NEW_tblEmailsSent','tblEmailsSent';
COMMIT;
After the new table has the IDENTITY property, you should normally set its current identity value to the maximum of the actual values in the table. If you don't, new rows inserted into the table would start from 1.
One way to do it is to run DBCC CHECKIDENT after you have switched the tables:
DBCC CHECKIDENT('dbo.tblEmailsSent')
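If you prefer to reseed explicitly rather than rely on that check, a minimal sketch (the variable is only for illustration):
DECLARE @maxid int;
SELECT @maxid = ISNULL(MAX([id]), 0) FROM [dbo].[tblEmailsSent];
DBCC CHECKIDENT('dbo.tblEmailsSent', RESEED, @maxid);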
Alternatively, you can specify the new seed in the table definition:
CREATE TABLE [dbo].[NEW_tblEmailsSent](
[id] [int] IDENTITY(<max value of id + 1>, 1) NOT NULL,

Related

Violation of PRIMARY KEY constraint 'xx'. Cannot insert duplicate key in object 'cc'. The duplicate key is (x)

Updated question for better understanding, and because I found the solution I was looking for:
My script goes like this:
CREATE TABLE [dbo].[MyTable]
(
[Columndata1] [nvarchar] (255) NOT NULL,
[Columndata2] [nvarchar] (max) NOT NULL,
[Columndata3] [nvarchar] (max) NOT NULL,
[ColumndataTime] [datetime] NOT NULL,
CONSTRAINT [PK_MyTable]
PRIMARY KEY CLUSTERED ([Columndata1] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
ALTER TABLE [dbo].[MyTable]
ADD CONSTRAINT [DF_MyTable_ColumndataTime] DEFAULT (getutcdate()) FOR [ColumndataTime]
GO
I am trying to do a workaround in case of a duplicate PK, so that if it happens the request is ignored, and if not it goes through.
I guess I wasn't clear about it in my initial question.
I think you already have your answer: you are inserting duplicate values into the column you designated as your primary key. The error you are getting tells you this, and it's all fairly clear. I can see that you assumed something more "sinister" was happening, but it seems that this wasn't the case, and it wasn't a "race condition" or something more complex.
However, I thought it might be worth pointing out something that I see as a bit of a "red flag". Maybe this doesn't qualify as an answer, but it does address some points in your original question, particularly when you start asking about the options in the "complicated" part of your CREATE TABLE script, and it's too long for a comment.
If you have a "default" out of the box installation of SQL Server then 90% of the statements in your CREATE TABLE script are simply redundant defaults.
I can run this script:
CREATE TABLE [dbo].[MyTable] (
[Columndata1] [nvarchar] (255) NOT NULL,
[Columndata2] [nvarchar] (max) NOT NULL,
[Columndata3] [nvarchar] (max) NOT NULL,
[ColumndataTime] [datetime] NOT NULL,
CONSTRAINT [PK_MyTable] PRIMARY KEY CLUSTERED ([Columndata1]));
Then I can generate the create script for the table, directly from SSMS, to get this:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[MyTable](
[Columndata1] [nvarchar](255) NOT NULL,
[Columndata2] [nvarchar](max) NOT NULL,
[Columndata3] [nvarchar](max) NOT NULL,
[ColumndataTime] [datetime] NOT NULL,
CONSTRAINT [PK_MyTable] PRIMARY KEY CLUSTERED
(
[Columndata1] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
That second script looks very similar to the one in your post now doesn't it?
Now there's nothing necessarily wrong with specifying all of these default options, and there are cases where it might work to your advantage. But my personal preference (and the preference of everyone I have ever worked with) is to omit the default options, as they just make scripts harder to peer review and are essentially "clutter". You can argue over whether it's worth specifying the ASC in the PRIMARY KEY section, since some of this is assumed knowledge, and there's always the possibility that Microsoft might decide to change the defaults in the future (and then the first script wouldn't generate what you wanted). However, taking a pragmatic view of how these things work, the chances of Microsoft changing these options in a future version are incredibly slim, as it would break so many databases out in the wild.
Take this as you want, but I thought it was worth a stab at explaining this, as you seem to be a little fixated (maybe the wrong word :)) on the "long part" (your words) in your original query?
My solution was to use a WHERE NOT EXISTS clause, greatly inspired by this answer: https://stackoverflow.com/a/3025332
INSERT INTO `table` (`value1`, `value2`)
SELECT 'stuff for value1', 'stuff for value2' FROM DUAL
WHERE NOT EXISTS (SELECT * FROM `table`
WHERE `value1`='stuff for value1' AND `value2`='stuff for value2' LIMIT 1)
This way I won't get duplicates.
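Note that the quoted snippet uses MySQL syntax (backticks, FROM DUAL, LIMIT). Applied to the MyTable definition above, the same pattern in T-SQL might look like this sketch (the literal values are placeholders; ColumndataTime is filled in by its default constraint):
INSERT INTO [dbo].[MyTable] ([Columndata1], [Columndata2], [Columndata3])
SELECT N'key value', N'data 2', N'data 3'
WHERE NOT EXISTS (SELECT 1 FROM [dbo].[MyTable] WHERE [Columndata1] = N'key value');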
Kudos to Richard Hansell for the explanation, making me move my focus from the "complicated" part.

cannot write to newly created table in SQL Azure

In our Azure SQL database we had a table App_Tracking that is/was used to track user actions. We needed to increase the size of the log buffer, so I first copied all the records over to an archive table that was defined using this SQL statement:
CREATE TABLE [dbo].[App_Tracking_Nov20_2015](
[ID] [int] IDENTITY(1,1) NOT NULL,
[UserID] [nvarchar](50) NOT NULL,
[App_Usage] [nvarchar](1024) NOT NULL,
[Timestamp] [datetime] NOT NULL )
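The copy step itself isn't shown above; a minimal sketch of what it might have looked like (SET IDENTITY_INSERT is only needed if the original ID values were to be preserved):
SET IDENTITY_INSERT [dbo].[App_Tracking_Nov20_2015] ON;
INSERT INTO [dbo].[App_Tracking_Nov20_2015] ([ID], [UserID], [App_Usage], [Timestamp])
SELECT [ID], [UserID], [App_Usage], [Timestamp]
FROM [dbo].[App_Tracking];
SET IDENTITY_INSERT [dbo].[App_Tracking_Nov20_2015] OFF;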
Then, using SQL Server Management Studio 2012, I recreated the original table using Drop/Create script generation:
USE [tblAdmin]
GO
/****** Object: Table [dbo].[App_Tracking] Script Date: 11/21/2015 11:42:01 AM ******/
DROP TABLE [dbo].[App_Tracking]
GO
/****** Object: Table [dbo].[App_Tracking] Script Date: 11/21/2015 11:42:01 AM ******/
SET ANSI_NULLS OFF
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[App_Tracking](
[ID] [int] IDENTITY(1,1) NOT NULL,
[UserID] [nvarchar](50) NOT NULL,
[App_Usage] [nvarchar](4000) NOT NULL,
[Timestamp] [datetime] NOT NULL,
CONSTRAINT [PrimaryKey_7c88841f-aaaa-bbbb-cccc-c26fe6a5720e] PRIMARY KEY CLUSTERED (
[ID] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) )
GO
This is the automated drop/create script that SSMS 2012 creates for you.
I then updated statistics on App_Admin using EXEC sp_updatestats
The gotcha is that I can no longer programmatically add records to this table.
If I open App_Admin from manage.windowsazure.net and "open in Visual Studio", I can manually add a record to it. But if in SSMS 2012 I run this code:
USE [tblAdmin]
GO
UPDATE [dbo].[App_Tracking] SET
[UserID] = 'e146ba22-930c-4b22-ac3c-15da47722e75' ,
[App_Usage] = 'search search: Bad Keyword: asdfadsfs' ,
[Timestamp] = '2015-11-20 20:00:18.700'
GO
nothing gets updated, but no error is thrown.
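As a side note, that UPDATE has no WHERE clause and the freshly recreated table is empty, so zero rows are affected and SQL Server reports success. If the goal was to add a test row, an INSERT along these lines would be needed (values taken from the statement above):
INSERT INTO [dbo].[App_Tracking] ([UserID], [App_Usage], [Timestamp])
VALUES ('e146ba22-930c-4b22-ac3c-15da47722e75', 'search search: Bad Keyword: asdfadsfs', '2015-11-20 20:00:18.700');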
If programmatically I use
var adminContext = new App_AdminEntities();
// ensure we don't fault on overflow of too long a keyword list
string prunedAction = action.Length <= 4000 ? action : action.Trim().Substring(0, 4000);
var appTracking = new App_Tracking
{
    UserID = userId,
    App_Usage = prunedAction,
    Timestamp = DateTime.Now
};
try
{
    adminContext.App_Tracking.Add(appTracking);
    adminContext.SaveChanges();   // this is where the error is thrown
    adminContext.Dispose();
}
catch (Exception)
{
    throw;
}
I get an error thrown on SaveChanges (which is the .NET call that writes to the SQL database). What did I do wrong?
OK, so I found the problem. It turns out I had not updated the associated EDMX file, and thus the error was being thrown by internal entity validation, which is kind of hidden under the covers.

SQL statement takes a long time to execute

I have a SQL Server database with a table containing a very large number of records. It used to work fine, but now the SQL statement takes a long time to execute.
Sometimes it causes the SQL database to use too much CPU.
This is the CREATE statement for the table:
CREATE TABLE [dbo].[tblPAnswer1](
[ID] [bigint] IDENTITY(1,1) NOT NULL,
[AttrID] [int] NULL,
[Kidato] [int] NULL,
[Wav] [int] NULL,
[Was] [int] NULL,
[ShuleID] [int] NULL,
[Mwaka] [int] NULL,
[Swali] [float] NULL,
[Wilaya] [int] NULL,
CONSTRAINT [PK_tblPAnswer1] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
And the following is the stored procedure for the statement:
ALTER PROC [dbo].[uspGetPAnswer1](@ShuleID int, @Mwaka int, @Swali float, @Wilaya int)
as
SELECT ID,
AttrID,
Kidato,
Wav,
Was,
ShuleID,
Mwaka,
Swali,
Wilaya
FROM dbo.tblPAnswer1
WHERE [ShuleID] = @ShuleID
AND [Mwaka] = @Mwaka
AND [Swali] = @Swali
AND Wilaya = @Wilaya
What is wrong with my SQL statement? I need help.
Just add an index on the ShuleID, Mwaka, Swali and Wilaya columns. The order of the columns in the index should depend on the distribution of the data (the column with the most selective values should come first in the index, and so on).
And if you need it super-fast, also include all the remaining columns used in the query, to get a covering index for this particular query.
EDIT: You should probably move the float column (Swali) from the indexed to the included columns.
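A minimal sketch of such an index (the index name and the key order are only illustrative; adjust the key order to your data distribution):
CREATE NONCLUSTERED INDEX IX_tblPAnswer1_ShuleID_Mwaka_Wilaya
ON dbo.tblPAnswer1 (ShuleID, Mwaka, Wilaya)
INCLUDE (Swali, AttrID, Kidato, Wav, Was);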
Add an index on the ID column and include the ShuleID, Mwaka, Swali and Wilaya columns. That should help improve the speed of the query.
CREATE NONCLUSTERED INDEX IX_ID_ShuleID_Mwaka_Swali_Wilaya
ON tblPAnswer1 (ID)
INCLUDE (ShuleID, Mwaka, Swali, Wilaya);
What is the size of the table? You may need additional indices as you are not using the primary key to query the data. This article by Pinal Dave provides a script to identify missing indices.
http://blog.sqlauthority.com/2011/01/03/sql-server-2008-missing-index-script-download/
It provides a good starting point for index optimization.
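That script is built on SQL Server's missing-index DMVs; a minimal sketch of the underlying idea (SQL Server 2005 and later):
SELECT TOP (20) mid.statement, mid.equality_columns, mid.inequality_columns, mid.included_columns, migs.user_seeks
FROM sys.dm_db_missing_index_details AS mid
JOIN sys.dm_db_missing_index_groups AS mig ON mig.index_handle = mid.index_handle
JOIN sys.dm_db_missing_index_group_stats AS migs ON migs.group_handle = mig.index_group_handle
ORDER BY migs.user_seeks DESC;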

Concurrent SQL Inserts Are Blocking on a Table

I am using SQL Server 2005 for an application. In my case, a number of requests are generated by different processes, each inserting records into one table. But when I examine the processes running in the database with sp_who2 'active', I find that the inserts are being blocked by other insert statements, which slows the process down. Is there any way to avoid blocking/deadlocks in concurrent inserts to one table? Below is the structure of my table.
CREATE TABLE [dbo].[Tbl_Meta_JS_Syn_Details](
[ID] [int] IDENTITY(1,1) NOT NULL,
[EID] [int] NULL,
[Syn_Points_ID] [int] NULL,
[Syn_ID] [int] NULL,
[Syn_Word_ID] [int] NULL,
[Created_Date_Time] [datetime] NULL CONSTRAINT [DF_Tbl_JS_Syn_Details_Created_Date_Time] DEFAULT (getdate()),
CONSTRAINT [PK_Tbl_JS_Syn_Details] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
There is always some blocking if many processes are trying to insert into one table. However, some settings can be used to limit the amount of time spent blocking.
What isolation level are you using? The default? http://technet.microsoft.com/en-us/library/ms173763.aspx
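If you're not sure, a quick sketch for checking the session's isolation level and any live blocking (SQL Server 2005 and later):
-- Lists the current session options, including "isolation level"
DBCC USEROPTIONS;
-- Shows which sessions are currently blocked and by whom
SELECT session_id, blocking_session_id, wait_type, wait_time
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0;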
Can you include the following information into this post:
1 - How many processes are running (inserting) at the same time?
2 - What type of disk sub-system are you using? RAID 5 or just a simple disk.
3 - What version of SQL Server you are on?
4 - What are the growth options on the database?
5 - How full is the current database?
6 - Is instant file initialization on?
Given answers to the above questions, you can optimize the insert process.

SQL Generate Script Not Creating Database

I created a script of my database.
But when I run it, the script does not create the database. It skips the CREATE DATABASE statement and only creates the tables (on whichever database I have selected at the moment, so not ideal...).
(The query executes with no errors, by the way.)
Why is this happening? Why can't you create a database and edit the content in it in one go?
(I know you can check if the db exists first, but this shouldn't be happening in the first place.)
--My Script--
CREATE DATABASE [EthicsDB]
USE [EthicsDB]
go
CREATE TABLE [dbo].[TempEmployee](
[PersonnelNumber] [int] IDENTITY(1,1) NOT NULL,
[Name] [varchar](80) NULL,
[SurName] [varchar](80) NULL,
[ManagerEmail] [varchar](80) NULL,
[ManagerName] [varchar](80) NULL,
CONSTRAINT [PK_TempEmployee] PRIMARY KEY CLUSTERED
(
[PersonnelNumber] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
You must use GO after CREATE DATABASE [EthicsDB].
Try this one -
USE [master]
GO
IF EXISTS (
SELECT 1 FROM sys.databases WHERE name = N'EthicsDB'
)
DROP DATABASE [EthicsDB]
GO
CREATE DATABASE [EthicsDB]
GO --<----
USE [EthicsDB]
GO
CREATE TABLE [dbo].[TempEmployee](
[PersonnelNumber] [int] IDENTITY(1,1) NOT NULL,
[Name] [varchar](80) NULL,
[SurName] [varchar](80) NULL,
[ManagerEmail] [varchar](80) NULL,
[ManagerName] [varchar](80) NULL,
CONSTRAINT [PK_TempEmployee] PRIMARY KEY CLUSTERED
(
[PersonnelNumber] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
If you run the SQL as provided you get an error message on the line
USE [EthicsDB]
This occurs because when SQL Server runs SQL commands (via SQLCMD or SSMS) the SQL is processed in batches.
As you have no GO statement after the CREATE DATABASE statement, it may be that SQL Server does not yet recognise that the new database EthicsDB has been created, and thus when you attempt to use the database via USE [EthicsDB] the statement fails.
As your SQL statements are not wrapped in a transaction and you are not checking for errors, if SQL Server encounters an error it will raise the error but also continue to process the rest of the query.
In the query provided, this leads to the new tables being created in the current database.
To correct the problem modify your query to
CREATE DATABASE [EthicsDB]
go
USE [EthicsDB]
go
You should probably wrap each action in a transaction block.
Also, when you are creating the table I generally do a check to see if it already exists first.
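For example, a minimal sketch of that kind of check, using the TempEmployee table from the question:
IF OBJECT_ID(N'dbo.TempEmployee', N'U') IS NULL
BEGIN
CREATE TABLE [dbo].[TempEmployee](
[PersonnelNumber] [int] IDENTITY(1,1) NOT NULL,
[Name] [varchar](80) NULL,
[SurName] [varchar](80) NULL,
[ManagerEmail] [varchar](80) NULL,
[ManagerName] [varchar](80) NULL,
CONSTRAINT [PK_TempEmployee] PRIMARY KEY CLUSTERED ([PersonnelNumber] ASC)
)
END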
If you run only the create database, what happens?