SQL type IGNORE_DUP_KEY

CREATE TYPE [dbo].[IdList] AS TABLE ([Id] [int] NULL)
GO
How can I insert the same value multiple times in a type?
I guess it needs something like IGNORE_DUP_KEY, but I can't seem to get it to work.

If there is no key or index on the column (as there isn't, in the statement you've given) then there already is no restriction on inserting the same value multiple times in a table.
DECLARE @i IdList
INSERT @i VALUES (1), (1), (1)
will work just fine. If you want to have a unique index with the IGNORE_DUP_KEY option so inserts will be discarded if the value is already there, rather than producing a constraint violation, you can do so by including a unique index with that option in the declaration:
CREATE TYPE [dbo].[IdList] AS TABLE (
[Id] [int] NULL,
INDEX IX_IdList_Id UNIQUE(ID) WITH (IGNORE_DUP_KEY = ON)
);
Or with a primary key (for non-nullable columns):
CREATE TYPE [dbo].[IdList] AS TABLE ([Id] [int] PRIMARY KEY WITH (IGNORE_DUP_KEY = ON));
Be careful with this, because silently discarding duplicate values can be a really good way to mask essential problems in your processing. SQL Server does produce the informational message "Duplicate key was ignored.", but that message is itself easy to miss (and gives no details about which key values were ignored).
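To see the behavior in action, here's a minimal sketch, assuming dbo.IdList was created with the UNIQUE index and IGNORE_DUP_KEY = ON as shown above:

```sql
-- Assumes the IGNORE_DUP_KEY version of the type has been created
DECLARE @i dbo.IdList;
INSERT @i VALUES (1), (1), (2);   -- prints "Duplicate key was ignored."
SELECT Id FROM @i;                -- only one row with 1, plus the row with 2
```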

Jeroen Mostert told me just to remove the key; I did, and it worked.

Related

Why does insert into table with primary key/identity column generate "Column name or number of supplied values does not match table definition" error

Table has a primary key/identity column with seed/increment of 1/1. When I try to insert a record into the table while omitting the primary key column because SQL should automatically assign that column a value, I get the following error: "Column name or number of supplied values does not match table definition."
I tried inserting a record while omitting the primary key/identity field.
I tried inserting a record with an explicit primary key/identity value and received the following error: "The user did not have permission to write to the column."
I tried setting IDENTITY_INSERT to ON and received the following error: "Cannot find the object "dbo.temp" because it does not exist or you do not have permissions."
CREATE TABLE [dbo].[temp](
[ProjectNumber] [INT] IDENTITY(1,1) NOT NULL,
[ServiceCenterID] [INT] NULL,
CONSTRAINT [PK_temp] PRIMARY KEY CLUSTERED
(
[ProjectNumber] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 100) ON [PRIMARY]
) ON [PRIMARY]
INSERT dbo.temp
SELECT 1 [ServiceCenterID]
I expect a record to be inserted into the table with the primary key/identity column (ProjectNumber) automatically assigned a value of 1. Instead I get the error "Column name or number of supplied values does not match table definition." even though ProjectNumber is an IDENTITY column.
If you want to assign the ProjectNumber yourself, then you need to bring in both columns and enable IDENTITY_INSERT first:
SET IDENTITY_INSERT dbo.temp ON;
INSERT dbo.temp (ProjectNumber, ServiceCenterID)
SELECT 1, 1 AS [ServiceCenterID];
SET IDENTITY_INSERT dbo.temp OFF;
If you want it set automatically, then:
INSERT dbo.temp (ServiceCenterID)
SELECT 1;
The key idea in both cases is to list the columns explicitly. That way, you are less likely to make errors on INSERTs. And, if you do, they should be easier to debug.

I have a GUID Clustered primary key - Is there a way I can optimize or unfragment a table that might be fragmented?

Here's the code I have. The table actually has 20 more columns but I am just showing the first few:
CREATE TABLE [dbo].[Phrase]
(
[PhraseId] [uniqueidentifier] NOT NULL,
[PhraseNum] [int] NULL,
[English] [nvarchar](250) NOT NULL,
PRIMARY KEY CLUSTERED ([PhraseId] ASC)
) ON [PRIMARY]
GO
From what I remember reading in
Fragmentation and GUID clustered key
it was good to have a GUID for the primary key, but now it's been suggested that's not a good idea, because data has to be reordered for each insert, causing fragmentation.
Can anyone comment on this? Now that my table has already been created, is there a way to defragment it? And how can I stop this problem from getting worse? Can I modify the existing table to add NEWSEQUENTIALID?
That's true: NEWSEQUENTIALID helps to completely fill the data and index pages.
But a uniqueidentifier is 16 bytes, four times the size of an int, so about four times as many pages will be required.
declare @t table(col int
,col2 uniqueidentifier DEFAULT NEWSEQUENTIALID())
insert into @t (col) values(1),(2)
select DATALENGTH(col2),DATALENGTH(col) from @t
Suppose x data pages are required to hold 100 rows keyed by int;
with NEWSEQUENTIALID, about 4x data pages will be required to hold the same 100 rows.
Therefore queries will read more pages to fetch the same number of records.
So, if you can alter the table, add an int identity column and make it the clustered primary key. You can keep or drop the [uniqueidentifier] column as your requirements dictate.
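A sketch of what that alteration might look like. The constraint names here are assumptions: the question's primary key was created without an explicit name, so look up the real, system-generated name in sys.key_constraints before dropping it.

```sql
-- Add a narrow identity column and move the clustered index onto it
ALTER TABLE dbo.Phrase ADD PhraseKey INT IDENTITY(1,1) NOT NULL;
ALTER TABLE dbo.Phrase DROP CONSTRAINT [PK_Phrase];   -- hypothetical name; find the real one first
ALTER TABLE dbo.Phrase ADD CONSTRAINT PK_Phrase_PhraseKey PRIMARY KEY CLUSTERED (PhraseKey);
ALTER TABLE dbo.Phrase ADD CONSTRAINT UQ_Phrase_PhraseId UNIQUE (PhraseId);  -- if you keep the GUID
```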
Looks like this is a duplicate of:
INT vs Unique-Identifier for ID field in database
But here's a rehash for your issue:
Rather than a GUID, and depending on your table depth, int or bigint would be a better choice, from both storage and optimization vantages. You might also consider defining the field as int identity not null so it is populated automatically.
GUIDs have a considerable storage impact, due to their length.
CREATE TABLE [dbo].[Phrase]
(
[PhraseId] [int] identity NOT NULL
CONSTRAINT [PK_Phrase_PhraseId] PRIMARY KEY,
[PhraseNum] [int] NULL,
[English] [nvarchar](250) NOT NULL,
....
) ON [PRIMARY]
GO

Clustered index trouble

In our production system (SQL Server 2008 / R2) there is a table in which generated documents are stored.
The documents have a reference (varchar) and a sequence_nr (int). The document may be generated multiple times and each iteration gets saved in this table incrementing the sequence number. Additionally each record has a data column (varbinary) and a timestamp as well as a user tag.
The table is only queried for auditing purposes later on, and during inserts (to determine the next sequence_nr).
The primary key for the table is clustered over the reference and sequence_nr columns.
As you can probably guess, generation of documents, and thus the data in the table (since a document can be generated again at a later time), does not grow in order.
I realized this after inserts in the table started timing out.
The inserts are performed with a stored procedure. The stored procedure determines the current max sequence_nr for the given reference and inserts the new row with the next sequence_nr.
I am fairly sure a poor choice of clustered index is causing the timeout problems, since records will be inserted for already existing references, only with a different sequence_nr and thus may end up anywhere in the record collection, but most likely not at the end.
On to my question: would it be better to go for a non-clustered index as primary key or would it be better to introduce an identity column, make it a clustered primary key and keep an index for the combination of reference and sequence_nr?
Note that for the time being (and, as far as we can foresee, ever) there is no need to query this table intensively, except for the case where a new sequence_nr must be determined.
Edit in answer to questions:
Tbh, I'm not sure about the timeout in the production environment. I do know that new documents get added in parallel running processes.
Table:
CREATE TABLE [dbo].[tbl_document] (
[reference] VARCHAR(50) NOT NULL,
[sequence_nr] INT NOT NULL,
[creation_date] DATETIME2 NOT NULL,
[creation_user] NVARCHAR (50) NOT NULL,
[document_data] VARBINARY(MAX) NOT NULL
);
Primary Key:
ALTER TABLE [dbo].[tbl_document]
ADD CONSTRAINT [PK_tbl_document] PRIMARY KEY CLUSTERED ([reference] ASC, [sequence_nr] ASC)
WITH (ALLOW_PAGE_LOCKS = ON, ALLOW_ROW_LOCKS = ON, PAD_INDEX = OFF, IGNORE_DUP_KEY = OFF, STATISTICS_NORECOMPUTE = OFF);
Stored procedure:
CREATE PROCEDURE [dbo].[usp_save_document] @reference NVARCHAR (50),
@sequence_nr INT OUTPUT,
@creation_date DATETIME2,
@creation_user NVARCHAR(50),
@document_data VARBINARY(max)
AS
BEGIN
SET NOCOUNT ON;
DECLARE @current_sequence_nr INT
SELECT @current_sequence_nr = max(sequence_nr)
FROM [dbo].[tbl_document]
WHERE [reference] = @reference
IF @current_sequence_nr IS NULL
BEGIN
SELECT @sequence_nr = 1
END
ELSE
BEGIN
SELECT @sequence_nr = @current_sequence_nr + 1
END
INSERT INTO [dbo].[tbl_document]
([reference],
[sequence_nr],
[creation_date],
[creation_user],
[document_data])
VALUES (@reference,
@sequence_nr,
@creation_date,
@creation_user,
@document_data)
END
END
Hope that helps.
I would go for making the PK nonclustered, since:
keeping a B-tree balanced when the key includes a varchar makes each leaf much bigger;
from what you say, you aren't scanning this table for many rows at a time.
Since a clustered index physically orders the records of the table to match the index order, it is only useful if you want to read several consecutive records in that order, because then whole records can be read with a sequential read from disk.
If you are only using data that is present in the index, there is no gain in making it clustered, because a nonclustered index is kept in order on its own pages, separate from the data, and can answer the query by itself.
So for your specific case a nonclustered index is the right way to go. Inserts won't need to reorder the data (only the index), and finding a new sequence_nr can be fulfilled by looking at the index alone.
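A sketch of that change against the definitions in the question (names taken from the question; note that dropping the clustered PK leaves the table as a heap):

```sql
ALTER TABLE dbo.tbl_document DROP CONSTRAINT PK_tbl_document;
ALTER TABLE dbo.tbl_document
    ADD CONSTRAINT PK_tbl_document PRIMARY KEY NONCLUSTERED ([reference] ASC, [sequence_nr] ASC);
```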

IGNORE_DUP_KEY in Sql Server 2000 with composite primary key

I'm trying to create a table in SQL Server 2000, that has a composite primary key with IGNORE_DUP_KEY set to ON.
I've tried looking for this option in SQL Server Management Studio Express but I couldn't find it, so now I'm down to creating the table programmatically. Every SQL command I found on Google or Stack Overflow gives me an error:
Incorrect syntax near '('.
The table should have 4 columns (A,B,C,D) all decimal(18) and I need the primary key on A,B,C.
I would appreciate if someone could post an example CREATE command.
create table MyTable2 (
[a] decimal(18,2) not null,
[b] decimal(18,2) not null,
[c] decimal(18,2) not null,
[d] decimal(18,2),
CONSTRAINT myPK PRIMARY KEY (a,b,c)
)
CREATE UNIQUE INDEX MyUniqueIgnoringDups
ON MyTable2 (a,b,c)
WITH IGNORE_DUP_KEY --SQL 2000 syntax
--WITH(IGNORE_DUP_KEY = On) --SQL 2005+ syntax
--insert some data to test.
insert into mytable2 (a,b,c,d) values (1,2,3,4);--succeeds; inserts properly
insert into mytable2 (a,b,c,d) values (1,2,3,5);--row is discarded; no error is raised.
-- "Duplicate key was ignored. (0 row(s) affected)"
For anyone interested, here's an explanation of what's happening from Erland Sommarskog on the MSDN forums:
When IGNORE_DUP_KEY is OFF, a duplicate key value causes an error and the entire statement is rolled back. That is, if the statement attempted to insert multiple rows, no rows are inserted.
When IGNORE_DUP_KEY is ON, a duplicate key value is simply ignored. The statement completes successfully and any other rows are inserted.
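A quick way to see that last point, sketched against the MyTable2 example above (assumes the IGNORE_DUP_KEY index is in place and the earlier test row (1,2,3,4) was inserted):

```sql
-- Multi-row insert: the duplicate (1,2,3) key is discarded,
-- but the statement still succeeds and the (7,8,9) row goes in
INSERT INTO MyTable2 (a,b,c,d) VALUES (1,2,3,6), (7,8,9,0);
-- message: "Duplicate key was ignored." -- 1 row affected
```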

Increasing performance on a logging table in SQL Server 2005

I have a "history" table where I log each request into a Web Handler on our web site. Here is the table definition:
/****** Object: Table [dbo].[HistoryRequest] Script Date: 10/09/2009 17:18:02 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[HistoryRequest](
[HistoryRequestID] [uniqueidentifier] NOT NULL,
[CampaignID] [int] NOT NULL,
[UrlReferrer] [nvarchar](512) NOT NULL,
[UserAgent] [nvarchar](512) NOT NULL,
[UserHostAddress] [nvarchar](15) NOT NULL,
[UserHostName] [nvarchar](512) NOT NULL,
[HttpBrowserCapabilities] [xml] NOT NULL,
[Created] [datetime] NOT NULL,
[CreatedBy] [nvarchar](100) NOT NULL,
[Updated] [datetime] NULL,
[UpdatedBy] [nvarchar](100) NULL,
CONSTRAINT [PK_HistoryRequest] PRIMARY KEY CLUSTERED
(
[HistoryRequestID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[HistoryRequest] WITH CHECK ADD CONSTRAINT [FK_HistoryRequest_Campaign] FOREIGN KEY([CampaignID])
REFERENCES [dbo].[Campaign] ([CampaignId])
GO
ALTER TABLE [dbo].[HistoryRequest] CHECK CONSTRAINT [FK_HistoryRequest_Campaign]
GO
37 seconds for 1050 rows on this statement:
SELECT *
FROM HistoryRequest AS hr
WHERE Created > '10/9/2009'
ORDER BY Created DESC
Does anyone have any suggestions for speeding this up? I have a clustered index on the PK and a regular index on the Created column. I tried a unique index and it barfed, complaining there is a duplicate entry somewhere - which is to be expected.
Any insights are welcome!
You are requesting all columns (*) over a non-covering index (Created). On a large data set you are guaranteed to hit the index tipping point, where a clustered index scan is more efficient than a nonclustered index range seek plus bookmark lookups.
Do you need * always? If yes, and if the typical access pattern is like this, then you must organize the table accordingly and make Created the leftmost clustered key.
If not, then consider changing your query to a covered query, e.g. select only HistoryRequestID and Created, which are covered by the nonclustered index. If more fields are needed, add them as included columns to the nonclustered index, but take into account that this will add extra storage space and I/O log write time.
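For instance, a covering index along those lines might look like this (the index name and the choice of included columns are illustrative, not from the original post):

```sql
CREATE NONCLUSTERED INDEX IX_HistoryRequest_Created
    ON dbo.HistoryRequest (Created)
    INCLUDE (CampaignID, UserHostAddress);  -- include only the columns the query actually selects
```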
Hey, I've seen some odd behavior when pulling XML columns in large sets. Try putting your index on Created back, then specify the columns in your SELECT statement, but omit the XML column. See how that affects the return time for results.
For a log table, you probably don't need a uniqueidentifier column. You're not likely to query on it either, so it's not a good candidate for a clustered index. Your sample query is on "Created", yet there's no index on it. If you query frequently on ranges of "Created" values then it would be a good candidate for clustering even though it's not necessarily unique.
OTOH, the foreign key suggests frequent querying by Campaign, in which case having the clustering done by that column could make sense, and would also probably do a better job of scattering the inserted keys in the indexes - both the surrogate key and the timestamp would add records in sequential order, which is net more work over time for insertions because the node sectors are filled less randomly.
If it's just a log table, why does it have update audit columns? It would normally be write-only.
Rebuild your indexes. Use the WITH (NOLOCK) hint after the table names where appropriate; this applies if you want to run long(ish) queries against tables that are heavily used in a live environment (such as a log table). It basically means your query might miss some of the very latest records, but you also aren't holding a lock open on the table, which would create additional overhead.
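A sketch of both suggestions (the date and column list are illustrative):

```sql
-- Rebuild all indexes on the table to remove fragmentation
ALTER INDEX ALL ON dbo.HistoryRequest REBUILD;

-- Read without taking shared locks; dirty reads are possible,
-- which is usually acceptable for a write-heavy log table
SELECT HistoryRequestID, CampaignID, Created
FROM dbo.HistoryRequest WITH (NOLOCK)
WHERE Created > '2009-10-09'
ORDER BY Created DESC;
```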