SQL constraint on a column value depending on the value of another column

First, I have simple [SomeType] table, with columns [ID] and [Name].
Also I have [SomeTable] table, with fields like:
[ID],
[SomeTypeID] (FK),
[UserID] (FK),
[IsExpression]
Finally, I need to enforce a constraint at the database level such that:
for concrete [SomeType] IDs (actually, for all but one),
for same UserID,
only one entry should have [IsExpression] equal to 1
(IsExpression is of BIT type)
I don't know whether this is a complex condition or not, but I have no idea how to write it. How would you implement such a constraint?

You can do this with a filtered index:
CREATE UNIQUE NONCLUSTERED INDEX [IDX_SomeTable] ON [dbo].[SomeTable]
(
[UserID] ASC
)
WHERE ([SomeTypeID] <> 1 AND [IsExpression] = 1)
or:
CREATE UNIQUE NONCLUSTERED INDEX [IDX_SomeTable] ON [dbo].[SomeTable]
(
[UserID] ASC,
[SomeTypeID] ASC
)
WHERE ([SomeTypeID] <> 1 AND [IsExpression] = 1)
It depends on what you are trying to achieve: the first index allows only one row with [IsExpression] = 1 per user, regardless of [SomeTypeID]; the second allows one row with [IsExpression] = 1 per user and per [SomeTypeID].
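A quick way to see the first variant in action (the ID values below are made up for illustration, and the referenced [SomeType] and user rows are assumed to exist so the FKs are satisfied):

```sql
-- Assumes the first filtered index from above exists on dbo.SomeTable.
INSERT INTO dbo.SomeTable (SomeTypeID, UserID, IsExpression) VALUES (2, 10, 1); -- ok
INSERT INTO dbo.SomeTable (SomeTypeID, UserID, IsExpression) VALUES (3, 10, 1); -- fails: duplicate key
INSERT INTO dbo.SomeTable (SomeTypeID, UserID, IsExpression) VALUES (3, 10, 0); -- ok: IsExpression = 0 is outside the filter
INSERT INTO dbo.SomeTable (SomeTypeID, UserID, IsExpression) VALUES (1, 10, 1); -- ok: SomeTypeID = 1 is excluded by the filter
```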


SQL Server Indexing and Composite Keys

Given the following:
-- This table will have roughly 14 million records
CREATE TABLE IdMappings
(
Id int IDENTITY(1,1) NOT NULL,
OldId int NOT NULL,
NewId int NOT NULL,
RecordType varchar(80) NOT NULL, -- 15 distinct values, will never increase
Processed bit NOT NULL DEFAULT 0,
CONSTRAINT pk_IdMappings
PRIMARY KEY CLUSTERED (Id ASC)
)
CREATE UNIQUE INDEX ux_IdMappings_OldId ON IdMappings (OldId);
CREATE UNIQUE INDEX ux_IdMappings_NewId ON IdMappings (NewId);
and this is the most common query run against the table:
WHILE @firstBatchId <= @maxBatchId
BEGIN
-- the result of this is used to insert into another table:
SELECT
map.NewId -- and lots of non-indexed columns from SOME_TABLE
FROM
IdMappings map
INNER JOIN
SOME_TABLE foo ON foo.Id = map.OldId
WHERE
map.Id BETWEEN @firstBatchId AND @lastBatchId
AND map.RecordType = @someRecordType
AND map.Processed = 0
-- We only really need this in case the user kills the binary or SQL Server service:
UPDATE IdMappings
SET Processed = 1
WHERE Id BETWEEN @firstBatchId AND @lastBatchId
AND RecordType = @someRecordType
SET @firstBatchId += 4999
SET @lastBatchId += 4999
END
What are the best indices to add? I figure Processed isn't worth indexing since it has only 2 values. Is RecordType worth indexing, given there are only about 15 distinct values? Roughly how many distinct values should a column have before it is worth indexing?
Is there any advantage in a composite key if some of the fields are in the WHERE and some are in a JOIN's ON condition? For example:
CREATE INDEX ix_IdMappings_RecordType_OldId
ON IdMappings (RecordType, OldId)
... if I wanted both these fields indexed (I'm not saying I do), does this composite key gain any advantage since both columns don't appear together in the same WHERE or same ON?
Insert time into IdMappings isn't really an issue. After we insert all records into the table, we don't need to do so again for months if ever.
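For what it's worth, one candidate worth benchmarking against the batch query above (a sketch based only on the query shown, not a definitive recommendation; the index name is made up) keys on RecordType plus the Id range and carries the other referenced columns at the leaf level, filtered to the unprocessed rows:

```sql
-- Sketch: seeks on RecordType and the Id range, reads OldId/NewId from the
-- leaf level without clustered-key lookups, and keeps only unprocessed rows
-- in the index at all (cheap since inserts are rare after the initial load).
CREATE NONCLUSTERED INDEX ix_IdMappings_RecordType_Id
ON IdMappings (RecordType, Id)
INCLUDE (OldId, NewId)
WHERE Processed = 0;
```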

SQL Server two columns in index but query on only one

In legacy code I found an index as follows:
CREATE CLUSTERED INDEX ix_MyTable_foo ON MyTable
(
id ASC,
name ASC
)
If I understand correctly, this index would be useful for querying on column id alone, or id and name. Do I have that correct?
So it would possibly improve retrieval of records by doing:
select someColumn from MyTable where id = 4
But it would do nothing for this query:
select someColumn from MyTable where name = 'test'
Yes, you are right. But consider a table with many columns:
A, B, C, D, ..., F
where the primary key index is on (A). If you have a second index such as (B, C), the engine may decide to use it for a query like this:
CREATE TABLE dbo.StackOverflow
(
A INT
,B INT
,C INT
,D INT
,E INT
,PRIMARY KEY (A)
,CONSTRAINT IX UNIQUE(B,C)
)
SELECT A
,C
FROM dbo.StackOverflow
WHERE C = 0;
So, if an index can serve as a covering index, the engine may use it even if you are not interested in some of its columns, because it decides that less work (fewer reads) will be performed.
In your case, it seems that a PK on the id column, combined with a second index on the name column, is the better choice.
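For reference, that suggestion could be sketched like this (constraint and index names are assumptions):

```sql
-- After dropping the existing clustered index ix_MyTable_foo:
-- a clustered PK on id serves lookups by id; the second index serves name.
ALTER TABLE MyTable ADD CONSTRAINT pk_MyTable PRIMARY KEY CLUSTERED (id);
CREATE NONCLUSTERED INDEX ix_MyTable_name ON MyTable (name);
```

Because id becomes the clustering key, it is carried implicitly in the leaf level of ix_MyTable_name, so a query like `select id from MyTable where name = 'test'` is covered by the second index.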

When I retrieve a non-indexed column from a large table, SQL Server looks up the PK

I have a huge table which has millions of rows:
create table BigTable
(
id bigint IDENTITY(1,1) NOT NULL,
value float not null,
vid char(1) NOT NULL,
dnf char(4) NOT NULL,
rbm char(6) NOT NULL,
cnt char(2) not null,
prs char(1) not null,
...
PRIMARY KEY CLUSTERED (id ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
)
There is a non-clustered unique index on this table which is on the following columns:
vid, dnf, rbm, cnt, prs
Now I would like to get some data out of this huge table using only a leading subset of that index's columns:
insert into #t
select b.id, b.rbm, b.value
from
(select distinct rbm from #t1) a
join
BigTable b on b.vid = '1'
and b.dnf = '1234'
and b.rbm = a.rbm
where:
create table #t
(
id integer primary key,
rbm varchar(8),
val float null
)
create index tmp_idx_t on #t(rbm)
create table #t1 (rbm char(6) not null)
If I don't include the value column in the query, the execution plan shows no key lookup on BigTable's clustered PK. But if value is in the result set, a key lookup appears on BigTable, and its output list is precisely the value column.
But I am not using any PKs here, so it takes a long time to finish. Is there a way to stop that key lookup? Or am I writing the SQL wrong?
I know the BigTable is not a great design but I can't change it.
In order to avoid the PK key lookup you'll need the index to cover all the columns the query touches (both the filtering predicate and the result set). For example:
create index ix1 on BigTable (dnf, vid, rbm) include (value);
This index uses only the first three columns (dnf, vid, rbm) for searching and carries value as an included column at the leaf level. As pointed out by @DanGusman, the clustering key id is always present in a SQL Server nonclustered index. This way the index has all the data it needs to resolve the query.
Alternatively, you could use the old way:
create index ix1 on BigTable (dnf, vid, rbm, value);
This will also work, but will generate a heavier index. That being said, this heavier index may also be of use for other queries.
Your existing index is lacking the value column, so in order to resolve the query the engine has to go back to the data pages. I would advise including that column in the index, so that the index covers the query.
I am thinking that you might try writing the query as:
select b.id, b.rbm, b.value
from BigTable b
where b.vid = '1' and
b.dnf = '1234' and
exists (select 1 from #t1 t1 where t1.rbm = b.rbm);
This might encourage SQL Server to use a more optimal execution plan, even when id is not in the index.

SQL Index Update with Covering Columns

I am creating an index on a table and I want to include a covering column: messageText nvarchar(1024)
After insertion, the messageText is never updated, so it's an ideal candidate to include in a covering index to speed up lookups.
But what happens if I update other columns in same index?
Will the entire row in the index need reallocating or will just that data from the updated column be updated in the index?
Simple Example
Imaging the following table:
CREATE TABLE [Messages](
[messageID] [int] IDENTITY(1,1) NOT NULL,
[mbrIDTo] [int] NOT NULL,
[isRead] [bit] NOT NULL,
[messageText] [nvarchar](1024) NOT NULL
)
And the following Index:
CREATE NONCLUSTERED INDEX [IX_messages] ON [Messages] ( [mbrIDTo] ASC, [messageID] ASC )
INCLUDE ( [isRead], [messageText])
When we update the table:
UPDATE Messages
SET isRead = 1
WHERE (mbrIDTo = 6546)
The query plan shows that the index IX_messages is utilized and will also be updated, because the column isRead is part of the index.
Therefore does including large text fields (such as messageText in the above) as part of a covering column in an index, impact performance when other values, in that same index, are updated?
When an index row is updated in SQL Server, the update can be performed as a delete of the entire old row followed by an insert of a new row with the updated values. In that case, even though the messageText field is not changing, it still has to be re-written to disk.
Here is a blog post from Paul Randal with a good example: http://www.sqlskills.com/blogs/paul/do-changes-to-index-keys-really-do-in-place-updates/
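If the update cost of carrying the wide column turns out to matter, a narrower variant that leaves messageText out of the index is one possible trade-off (a sketch; the index name is an assumption, and reads of messageText then require a lookup into the base table):

```sql
-- Same keys, but only the small flag is included; updates to isRead touch a
-- much smaller index row, at the cost of lookups when messageText is read.
CREATE NONCLUSTERED INDEX IX_messages_narrow ON Messages (mbrIDTo ASC, messageID ASC)
INCLUDE (isRead);
```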

Index to enforce a single TRUE value per table in bit column

I have table that contains an IsDefault column:
CREATE TABLE CustomerType
(
ID int IDENTITY(1,1) NOT NULL,
Name nvarchar(50) NOT NULL,
IsDefault bit NOT NULL
)
The IsDefault value should, naturally, be TRUE for only a single row, all other rows should be FALSE. I want to enforce this rule on the database level.
Currently I achieve this by adding a new computed column and placing a UNIQUE NONCLUSTERED INDEX on it:
CREATE TABLE CustomerType
(
ID int IDENTITY(1,1) NOT NULL,
Name nvarchar(50) NULL,
IsDefault bit NOT NULL,
IsDefaultConstraint AS (CASE WHEN IsDefault = 1 THEN 1 ELSE -ID END)
)
CREATE UNIQUE NONCLUSTERED INDEX UQ_CustomerType_IsDefault ON CustomerType
(
IsDefaultConstraint ASC
)
This works just fine, but has a bit of code smell to it because the extra column doesn't contain relevant data and is just used for enforcing the unique index.
Are there alternative ways to enforce the same behavior?
For SQL Server 2008 or later, use a filtered index:
CREATE UNIQUE INDEX IX_Default on CustomerType (IsDefault) WHERE IsDefault = 1
For older versions, you use the "poor man's filtered index", the indexed view:
CREATE VIEW dbo.DRI_CustomerType_Default
WITH SCHEMABINDING
AS
SELECT IsDefault FROM dbo.CustomerType WHERE IsDefault = 1
GO
CREATE UNIQUE CLUSTERED INDEX IX_Default on DRI_CustomerType_Default (IsDefault)
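To sanity-check the filtered-index approach (the values below are illustrative): only a second row with IsDefault = 1 is rejected, while any number of non-default rows are allowed.

```sql
-- Assumes the filtered index IX_Default from above exists on CustomerType.
INSERT INTO CustomerType (Name, IsDefault) VALUES (N'Retail', 1);    -- ok
INSERT INTO CustomerType (Name, IsDefault) VALUES (N'Wholesale', 0); -- ok
INSERT INTO CustomerType (Name, IsDefault) VALUES (N'Online', 1);    -- fails: duplicate key on IX_Default
```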
Unfortunately SQL Server doesn't provide function-based indexes, which is what you are looking for. So your approach is the best available.
If the additional column is too annoying, then use a view on the table hiding that column.
If this is still annoying you, switch to Oracle ;-)