Create primary key on existing table with data - SQL

As part of a migration project, we have imported data from a JDE iSeries DB2 database. An SSIS package was created to create the destination tables and import the data. The import completed successfully.
Now comes the problem: the customer wants primary keys created in the destination DB (SQL Server 2008 R2). The problem table in this case has 104 columns and 7.5 million rows of data. The PK required for this table is composite, spanning 7 columns.
We are considering this:
BEGIN TRANSACTION
GO
ALTER TABLE [dbo].[F0911] ADD CONSTRAINT [F0911_PK] PRIMARY KEY CLUSTERED
(
[GLDCT] ASC,
[GLDOC] ASC,
[GLKCO] ASC,
[GLDGJ] ASC,
[GLJELN] ASC,
[GLLT] ASC,
[GLEXTL] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
COMMIT
or this:
-- Rename existing tables
EXEC sp_rename 'F0911', 'F0911_old'
GO
-- Create new table
SELECT * INTO F0911 FROM F0911_old WHERE 1=0
GO
--Create PK constraints
ALTER TABLE [dbo].[F0911] ADD CONSTRAINT [F0911_PK] PRIMARY KEY CLUSTERED
(
[GLDCT] ASC,
[GLDOC] ASC,
[GLKCO] ASC,
[GLDGJ] ASC,
[GLJELN] ASC,
[GLLT] ASC,
[GLEXTL] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
--Insert data into new tables
INSERT INTO F0911
SELECT * FROM F0911_old
GO
-- Drop old tables
DROP TABLE F0911_old
GO
Which would be the more efficient approach, performance-wise? I have a gut feeling that both are the same, and that the first approach implicitly does the same thing as the second. Is this understanding correct?
Please note that all these columns already exist in the table and we cannot modify the table definition.

They're the same. The effect of creating a clustered index is to arrange the pages, which will happen in both cases. For non-clustered indexes, it helps to disable the index before the load and then rebuild it afterwards.
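For example, a rough sketch of that disable/rebuild pattern (the non-clustered index name here is hypothetical, for illustration only):
-- Hypothetical non-clustered index name, for illustration
ALTER INDEX [IX_F0911_GLDOC] ON [dbo].[F0911] DISABLE
GO
-- ... bulk load the data here ...
ALTER INDEX [IX_F0911_GLDOC] ON [dbo].[F0911] REBUILD
GO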

I think the first approach is right, but I don't understand the reason for the BEGIN TRANSACTION and COMMIT. I don't think the transaction is necessary, because you are not modifying the data in the table. A transaction is used when you have to lock data that is being modified in real time, so that stale data is not read.
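In other words, the ALTER TABLE from the first approach can simply be run on its own, without the transaction wrapper:
ALTER TABLE [dbo].[F0911] ADD CONSTRAINT [F0911_PK] PRIMARY KEY CLUSTERED
([GLDCT] ASC, [GLDOC] ASC, [GLKCO] ASC, [GLDGJ] ASC, [GLJELN] ASC, [GLLT] ASC, [GLEXTL] ASC)
GO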

Related

Issue with re-create index/constraint after dropping it

I am using SQL Server 2008 R2.
I encountered this issue when re-creating an index.
As I need to alter a column, I drop the constraint/index first and then create the constraint/index again.
However, it shows an error message saying
The operation failed because an index or statistics with name 'ABC' already exists on table 'test_table'
I wonder why this error message is shown, since I have dropped the constraint.
I wrote this to drop the index:
DROP INDEX [ABC] ON [dbo].[test_table] WITH ( ONLINE = OFF )
I then re-create the index:
CREATE NONCLUSTERED INDEX [ABC] ON [dbo].[test_table]
(
[col_1] ASC,
[col_2] ASC,
[col_3] ASC
)
INCLUDE ( [col_4],
[col_5]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
Does anyone have any idea what's wrong here?

SQL Server ALTER COLUMN NOT NULL has no effect

The designers of table SOME_TABLE did not define a primary key, and worse, they set one of the columns that could define the primary key as NULLable (the others are OK).
The data for SOME_TABLE.PrinterPos does not contain any NULL values.
I am writing an upgrade script to apply to ~50 databases.
The following code is failing:
ALTER TABLE dbo.SOME_TABLE
ALTER COLUMN PrinterPos smallint NOT NULL;
ALTER TABLE dbo.SOME_TABLE
ADD CONSTRAINT PK_SOME_TABLE
PRIMARY KEY CLUSTERED (SOME_TABLE_ID ASC, Store_ID ASC, PrinterPos ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF,
IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY];
I get the message
Cannot define PRIMARY KEY constraint on nullable column in table 'SOME_TABLE'.
It looks like the first command is being totally ignored, although there is no message to indicate this.
To put it in its own batch, I have tried executing the first command using sp_executesql, to no effect.
If I execute the first command in SQL Server Management Studio followed by the second then it executes OK.
I need to get this change fully automated. How can I get this to work via script?
Try adding a GO batch separator between the two ALTER TABLE commands.
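For example, split into two batches so the ADD CONSTRAINT is compiled after the column has actually been altered:
ALTER TABLE dbo.SOME_TABLE
ALTER COLUMN PrinterPos smallint NOT NULL;
GO
ALTER TABLE dbo.SOME_TABLE
ADD CONSTRAINT PK_SOME_TABLE
PRIMARY KEY CLUSTERED (SOME_TABLE_ID ASC, Store_ID ASC, PrinterPos ASC);
GO
Note that GO is a batch separator understood by tools such as SSMS and sqlcmd rather than a T-SQL statement, so whatever runs the upgrade script needs to honor it.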

SQL Conditional Unique Constraint With Where Clause Within Same Table

I have a table where I want to ensure that a combination of five columns remains unique within that table. For example:
ALTER TABLE [dbo].[MyTable]
ADD CONSTRAINT [UQ__MyTable.MFG.Model.Class.Depiction.Iteration]
UNIQUE NONCLUSTERED
(
[ManufacturerID] ASC,
[Model] ASC,
[BlockClassID] ASC,
[BlockDepictionID] ASC,
[BlockIterationID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF,
ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
ON [PRIMARY]
GO
I want to exclude combinations where a sixth, separate column has a particular value. For example, I only want to enforce the above constraint when the column [Flag] = 0 and skip enforcement when [Flag] = 1.
As a workaround, you can get the proper ANSI behavior in SQL Server 2008 and above by creating a unique, filtered index.
CREATE UNIQUE NONCLUSTERED INDEX [IX__MyTable.MFG.Model.Class.Depiction.Iteration]
ON [dbo].[MyTable] ([ManufacturerID],[Model],[BlockClassID],[BlockDepictionID],[BlockIterationID])
WHERE [Flag] = 0;
TechNet article
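As a rough illustration of how the filtered unique index behaves (the values are made up, and this assumes the listed columns are enough to insert a row):
INSERT INTO [dbo].[MyTable] ([ManufacturerID],[Model],[BlockClassID],[BlockDepictionID],[BlockIterationID],[Flag])
VALUES (1, 'M1', 10, 100, 1000, 0)  -- succeeds
INSERT INTO [dbo].[MyTable] ([ManufacturerID],[Model],[BlockClassID],[BlockDepictionID],[BlockIterationID],[Flag])
VALUES (1, 'M1', 10, 100, 1000, 1)  -- succeeds: Flag = 1 rows are not covered by the filtered index
INSERT INTO [dbo].[MyTable] ([ManufacturerID],[Model],[BlockClassID],[BlockDepictionID],[BlockIterationID],[Flag])
VALUES (1, 'M1', 10, 100, 1000, 0)  -- fails: duplicate key in the unique filtered index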

Composite Keys SQL server

I have created a joining table for a many-to-many relationship.
The table only has 2 columns in it, ticketid and groupid.
Typical data would be:
groupid ticketid
20 56
20 87
20 96
24 13
24 87
25 5
My question is: when creating the composite key, should I have ticketid followed by groupid
CONSTRAINT [PK_ticketgroup] PRIMARY KEY CLUSTERED
(
[ticketid] ASC,
[groupid] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
Or the other way, groupid followed by ticketid
CONSTRAINT [PK_ticketgroup] PRIMARY KEY CLUSTERED
(
[groupid] ASC,
[ticketid] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
Would searching the index be quicker in option 1, as the ticketids have more chance of being unique than the groupids and they would be at the start of the composite key? Or is this negligible?
The difference would most likely be negligible.
It is, however, recommended for SQL Server that the most selective column be placed first. If a column with low selectivity is placed first, the optimizer may determine that your index is not very selective and choose to ignore it. See this sqlserverpedia.com wiki article for more information.
I would actually create two indexes. Given that ticket IDs are more likely to be unique, the clustered index would be (GroupID, TicketID), in that order; I would then create a non-clustered, non-unique index on TicketID.
The reason is that if you want to query based only on group ID, a group's rows are logically contiguous and stored as a block. The other index gives you the fastest access when only TicketID is specified.
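A sketch of that two-index layout (the table name ticketgroup is assumed here from the constraint name):
-- Clustered primary key: groupid first, so each group's rows are stored together
ALTER TABLE [dbo].[ticketgroup] ADD CONSTRAINT [PK_ticketgroup] PRIMARY KEY CLUSTERED
(
[groupid] ASC,
[ticketid] ASC
)
GO
-- Non-clustered, non-unique index for lookups by ticketid alone
CREATE NONCLUSTERED INDEX [IX_ticketgroup_ticketid] ON [dbo].[ticketgroup] ([ticketid] ASC)
GO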
I do think it would probably be negligible overall depending on how the data will be queried (i.e. if groupid and ticketid are always provided).

Soft Delete - Use IsDeleted flag or separate joiner table?

Should we use a flag for soft deletes, or a separate joiner table? Which is more efficient? Database is SQL Server.
Background Information
A while back we had a DB consultant come in and look at our database schema. When we soft delete a record, we update an IsDeleted flag on the appropriate table(s). It was suggested that instead of using a flag, we store the deleted records in a separate table and use a join, as that would perform better. I've put that suggestion to the test, but at least on the surface, the extra table and join look to be more expensive than using a flag.
Initial Testing
I've set up this test.
Two tables, Example and DeletedExample. I added a nonclustered index on the IsDeleted column.
I did three tests, loading a million records with the following deleted/non-deleted ratios:
Deleted/NonDeleted
50/50
10/90
1/99
Results - 50/50
Results - 10/90
Results - 1/99
Database scripts, for reference: Example, DeletedExample, and the index on Example.IsDeleted
CREATE TABLE [dbo].[Example](
[ID] [int] NOT NULL,
[Column1] [nvarchar](50) NULL,
[IsDeleted] [bit] NOT NULL,
CONSTRAINT [PK_Example] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[Example] ADD CONSTRAINT [DF_Example_IsDeleted] DEFAULT ((0)) FOR [IsDeleted]
GO
CREATE TABLE [dbo].[DeletedExample](
[ID] [int] NOT NULL,
CONSTRAINT [PK_DeletedExample] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[DeletedExample] WITH CHECK ADD CONSTRAINT [FK_DeletedExample_Example] FOREIGN KEY([ID])
REFERENCES [dbo].[Example] ([ID])
GO
ALTER TABLE [dbo].[DeletedExample] CHECK CONSTRAINT [FK_DeletedExample_Example]
GO
CREATE NONCLUSTERED INDEX [IX_IsDeleted] ON [dbo].[Example]
(
[IsDeleted] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
The numbers you have seem to indicate that my initial impression was correct: if your most common query against this database is to filter on IsDeleted = 0, then performance will be better with a simple bit flag, especially if you make wise use of indexes.
If you often query for deleted and undeleted data separately, then you could see a performance gain by having a table for deleted items and another for undeleted items, with identical fields. But denormalizing your data like this is rarely a good idea, as it will most often cost you far more in code maintenance than it will gain you in performance.
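For instance, one way to make wise use of indexes with the flag approach is a filtered index that covers only the live rows (a sketch against the Example table above, SQL Server 2008+; the index name and included column are just illustrative):
-- Covers only non-deleted rows, so queries filtering on IsDeleted = 0 stay narrow
CREATE NONCLUSTERED INDEX [IX_Example_NotDeleted] ON [dbo].[Example]
(
[ID] ASC
)
INCLUDE ([Column1])
WHERE [IsDeleted] = 0
GO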
I'm not an SQL expert, but in my opinion it all depends on how heavily the database is used. If the database is accessed by a large number of users and needs to be efficient, then using a separate IsDeleted table is good. The better option would be to use a flag in production and, as part of daily/weekly/monthly maintenance, move all the soft-deleted records to the IsDeleted table and clear the soft-deleted records out of the production table. A mixture of both options would be a good one.